OpenAI has provided an early glimpse into DALL·E 3, the latest iteration of its renowned image generation tool. This new version promises to deliver images that are more in line with user queries, emphasizing its enhanced capability to understand and interpret prompts. The announcement of the new model comes after information about it was recently leaked online.
DALL·E is an image-generating AI developed by OpenAI with backing from Microsoft, which provided an Azure-powered supercomputer to train it. The same computing system was used to train the GPT models, now up to GPT-4, which power services such as Bing Chat and Microsoft 365 Copilot. DALL·E also powers Microsoft's Bing Image Creator.
Key Features and Improvements
DALL·E 3 stands out for its significant advancements in understanding the nuances of prompts, especially the longer ones. It has shown marked improvement over its predecessor, DALL·E 2, which was introduced in April 2022.
OpenAI (@OpenAI) announced the model in a post on September 20, 2023: "Our new text-to-image model, DALL·E 3, can translate nuanced requests into extremely detailed and accurate images. Coming soon to ChatGPT Plus & Enterprise, which can help you craft amazing prompts to bring your ideas to life: https://t.co/jDXHGNmarT"
One of the major updates is the integration with ChatGPT, allowing users to refine their image requests through interactive conversations with the chatbot. This means users can now receive the generated images directly within the chat application. OpenAI has scheduled the release of DALL·E 3 for ChatGPT Plus and enterprise customers in October, with a broader release for the public and API customers planned for later this fall.
The tool's ability to produce high-quality images that closely match user queries is noteworthy. For instance, DALL·E 3 can follow intricate descriptions closely and render in-image text, such as labels and signs, which was a challenge for earlier models. OpenAI's promotional materials suggest that DALL·E 3 can render objects with minimal deformations, adhering faithfully to the provided prompts.
Safety and Ethical Considerations
OpenAI has also emphasized its commitment to safety and ethical considerations. The company has introduced measures to enhance the safety of DALL·E 3 and minimize algorithmic bias. In response to concerns raised by artists about image generators, DALL·E 3 has been programmed to decline requests for images in the style of living artists. Moreover, artists can now opt to exclude some or all of their images from the training of future OpenAI image generation models.
In addition to these measures, OpenAI has announced collaborations with expert contractors to conduct “red teaming” of its products, aiming to identify potential biases and other issues.
The Competitive Landscape
While DALL·E 3 is poised to set new standards in the realm of image generation, OpenAI faces competition from other tools in the market. Open-source tools like Stable Diffusion and offerings from various tech companies are also vying for a share of the market. However, with its advanced features and the backing of OpenAI's reputation, DALL·E 3 is well-positioned to lead the way in AI-driven image generation.
Recent Examples of AI Image Generators
- OpenAI has also introduced Shap-E, a generative model that can create 3D models from text, opening up new possibilities for AI beyond 2D image creation.
- Stability AI, a startup that focuses on generative AI, has released StableStudio, an open-source web app that uses its Stable Diffusion model to generate images from text prompts. Users can also use DreamStudio features to make multiple variations of an image with different styles and attributes.
- Meta, the company formerly known as Facebook, has unveiled I-JEPA, an image model based on its Joint Embedding Predictive Architecture. Unlike the text-to-image generators listed here, I-JEPA is a self-supervised model that learns by predicting abstract representations of missing image regions rather than generating images from text descriptions.
- Alibaba, the Chinese e-commerce giant, has launched Tongyi Wanxiang, an AI image generator that accepts prompts in both Chinese and English. Users can customize the image output parameters using Composer, a large model developed by Alibaba Cloud.
- Chip giant Nvidia debuted its Perfusion AI art creation tool in August.