A little over a month after launching its most capable text-to-image model, Ideogram has pushed an update introducing several new capabilities to the AI, including description-based referencing and negative prompting.

Available on Ideogram’s web platform, the features are designed to give users more control over how they produce images as well as enhance the overall quality and coherence of the outputs. The update strengthens the service and marks another notable step toward closing the gap with rival offerings in the image generation space, including Midjourney and DALL-E.

The features can be tested right away, although not all of them are available to those using the free version of the platform.

What’s new in Ideogram?

When Ideogram launched version 1.0 of its model in February, users were treated to a magic prompt feature that expands and adds detail to their inputs. Now, building on this work, the company has introduced a new Describe capability that generates descriptions or captions from reference images.

Essentially, a user can now take an Ideogram-generated public image or upload their own to get a text-based description of the image. This description can then be used as a prompt to generate a very similar image. If needed, users can also edit the generated description to modify the output according to their needs.

But there’s more.

In addition to descriptions for reference images, Ideogram is adding negative prompting as well as the option to choose between Fast, Default or Quality modes on the platform. Negative prompting, as the name suggests, will allow users to tell the model what they do not want to see in the output. It has been designed to let users remove certain objects or tailor the style of the generations.

Meanwhile, the mode options will allow users to control how quickly the output is generated. The Fast mode, Ideogram says, will produce an image in about five seconds with very basic quality, while the Quality mode will focus on photorealism and detail but take about 20 seconds. The Default mode will sit between the two, balancing both aspects and taking about 12 seconds.

While it remains to be seen how many users actually switch between these modes, Ideogram says they can be used to quickly generate a basic image and then iterate on it for high-quality results.

Improved photorealism and text rendering

Finally, Ideogram also said it is enhancing text rendering with the latest update, reducing error rates by a further 15%. This isn’t a big change, but the company says the model performs better than DALL-E 3 Vivid when producing characters and words.

Ideogram did not share stats comparing the upgraded model with Midjourney, which leads in the AI image generation category. However, it did claim that the model delivers enhanced image coherence and photorealism in the outputs and is preferred over the last version by human raters. 

“Human raters prefer images generated by the upgraded model 30-50% more than the prior version in prompt alignment, image coherence, and text rendering quality,” the company, which has roped in over seven million creators since launching public beta last year, wrote in a blog post.

Currently, negative prompting and the new speed modes are restricted to users paying for Ideogram’s Basic and Plus plans. There’s no clarity on the availability of reference image captioning, although we suspect it might be free, as it loosely matches the Remix feature the company offers to let users generate images similar to existing reference images. The text rendering and image coherence enhancements, meanwhile, are available to users across the board.