I heard a lot about the new image generation models last week, so I tested them to see what has improved. I gave the prompt below to various image generation models, old and new.
A Calvin and Hobbes strip. Calvin is boxing Hobbes, with a dialog bubble from Calvin, saying “Bring it on!”
Stable Diffusion XL Lightning
Stable Diffusion XL Base
Dall-E API
Runway ML
Imagen 3
Dall-E 3 API
Ideogram 2.0
Flux.dev via Fal.ai
ChatGPT Plus
A few observations:
- Text rendering inside images has come a long way. The newer models have little problem generating clear text.
- Flux.1 seems to be the best of the newly released models.
- But OpenAI’s ChatGPT seems to produce output as good as Flux.1’s.
On the last point, it’s noteworthy that Dall-E 3 (the engine behind ChatGPT) gives a poor result when called directly through the API. Clearly, prompting makes a difference. Here’s the prompt ChatGPT wrote from mine and passed to Dall-E 3.
A comic strip style image featuring Calvin, a young boy with spiky hair, standing in a playful boxing stance with oversized boxing gloves. He looks determined as he says ‘Bring it on!’ in a speech bubble. Facing him is Hobbes, a tall and slightly bemused tiger, also in a mock boxing pose with a gentle smile, as if humoring Calvin. The scene is set in Calvin’s backyard, typical of a Calvin and Hobbes comic, with a simple and uncluttered backdrop.
But just as clearly, prompting is far from the primary driver. Here’s the result of the above prompt on the Dall-E 3 API. The model ChatGPT is using behind the scenes seems to be a significant improvement over Dall-E 3.
The same detailed prompt does extremely well on Imagen 3, though.
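For anyone who wants to repeat the Dall-E 3 API comparison, here is a minimal sketch of sending a prompt to the API from Python. It is not the code used for the images in this post. The endpoint, the `dall-e-3` model name, and the `revised_prompt` response field are assumptions based on OpenAI's public Images API, the helper names `build_request` and `generate` are made up for illustration, and the API key is assumed to be in an `OPENAI_API_KEY` environment variable.

```python
import json
import os
import urllib.request

# Assumed endpoint of OpenAI's Images API; check the current docs before relying on it.
API_URL = "https://api.openai.com/v1/images/generations"

# The short prompt from this post.
SHORT_PROMPT = (
    "A Calvin and Hobbes strip. Calvin is boxing Hobbes, with a dialog "
    "bubble from Calvin, saying “Bring it on!”"
)


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an image generation request. Hypothetical helper."""
    payload = {
        "model": "dall-e-3",   # assumed model identifier
        "prompt": prompt,
        "n": 1,
        "size": "1024x1024",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(prompt: str) -> dict:
    """Send the request and return the first result. Needs network access and a key."""
    request = build_request(prompt, os.environ["OPENAI_API_KEY"])
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumed response shape: {"data": [{"url": ..., "revised_prompt": ...}]}
    return body["data"][0]


# Example usage (commented out because it makes a paid API call):
# result = generate(SHORT_PROMPT)
# print(result.get("url"))
# print(result.get("revised_prompt"))
```

If the response does include a `revised_prompt`, printing it next to the image is a quick way to see how much the prompt was rewritten before the model drew anything, which is the effect discussed above.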
Update: 6 Oct 2024. Here’s what I get with meta.ai.
Update: 8 Oct 2024. Here’s what I got with Flux 1.1 Pro with the short prompt. (The detailed prompt gave me an error: “NSFW content detected in image. Try running it again, or try a different prompt.”)
The Calvin cartoon generated by ChatGPT is eerily close to the original!
It’s amazing to see what these bots can create when you ask them to bring a funny idea to life.
My last experiment brainstorming with the bots – https://mvark.blogspot.com/2024/08/let-ai-handle-overthinking.html