Midjourney debuts consistent characters for gen AI images

March 12, 2024

The popular AI image generating service Midjourney has deployed one of its most oft-requested features: the ability to recreate characters consistently across new images.

Character consistency has been a major hurdle for AI image generators to date, by their very nature.

That’s because most AI image generators rely on “diffusion models,” tools similar to or based on Stability AI’s open-source Stable Diffusion image generation algorithm. These work, roughly, by taking text entered by a user and trying to piece together an image, pixel by pixel, that matches the description, drawing on what they learned from similar imagery and text tags in their massive (and controversial) training data sets of millions of human-created images.

Why consistent characters are so powerful — and elusive — for generative AI imagery

Yet, as is the case with text-based large language models (LLMs) such as OpenAI’s ChatGPT or Cohere’s new Command-R, the problem with all generative AI applications is the inconsistency of their responses: the AI generates something new for every single prompt entered into it, even if the prompt is repeated or some of the same keywords are used.

This is great for generating whole new pieces of content, which in the case of Midjourney means images. But what if you’re storyboarding a film, a novel, a graphic novel or comic book, or some other visual medium where you want the same character or characters to move through it and appear in different scenes and settings, with different facial expressions and props?

This kind of consistency, which is typically necessary for narrative continuity, has been very difficult to achieve with generative AI so far. But Midjourney is now taking a crack at it, introducing a new tag, “--cref” (short for “character reference”), that users can add to the end of their text prompts in the Midjourney Discord. Followed by a URL that the user pastes in, the tag tells Midjourney to try to match the character’s facial features, body type, and even clothing from the image at that URL.

As the feature progresses and is refined, it could move Midjourney beyond being a cool toy or ideation source and toward more of a professional tool.

How to use the new Midjourney consistent character feature

The tag works best with previously generated Midjourney images. So, for example, the workflow for a user would be to first generate a character, or retrieve the URL of a previously generated one.
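To make the shape of such a prompt concrete, here is a minimal Python sketch that only assembles the text a user would paste into the Midjourney Discord: a description followed by the character reference tag and an image URL. It does not call Midjourney in any way. The function name `build_cref_prompt`, the scene description, and the example.com URL are all hypothetical and used purely for illustration.

```python
from urllib.parse import urlparse


def build_cref_prompt(description: str, reference_url: str) -> str:
    """Return prompt text with a character reference tag appended.

    Hypothetical helper: it only formats a string and does not
    contact Midjourney or Discord.
    """
    description = description.strip()
    if not description:
        raise ValueError("description must not be empty")

    # Basic sanity check that the reference looks like a web URL.
    parsed = urlparse(reference_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("reference_url must be an http(s) URL")

    # The tag goes at the end of the prompt, followed by the image URL.
    return f"{description} --cref {reference_url}"


if __name__ == "__main__":
    # Placeholder scene and URL, assumed for illustration only.
    print(build_cref_prompt(
        "the same man reading a map in a tavern",
        "https://example.com/character.png",
    ))
```

The resulting line is what would be typed as the prompt, with the URL pointing at the previously generated character image.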

Let’s start from scratch and say we are generating a new character with this prompt: “a muscular bald man with a beard and eye patch.”