DALL-E 2 Model
A DALL-E 2 Model is a text-to-image model based on a deep neural network trained to generate digital images from natural language descriptions, called prompts.
- Context:
- It has been developed by OpenAI.
- Example(s):
- …
- Counter-Example(s):
- See: OpenAI, Transformer (Machine Learning Model), GPT-3, VQ-VAE-like Model, dVAE, VAE.
References
2022
- (Wikipedia, 2022) ⇒ https://en.wikipedia.org/wiki/DALL-E Retrieved:2022-12-12.
- DALL-E (stylized as DALL·E) and DALL-E 2 are deep learning models developed by OpenAI to generate digital images from natural language descriptions, called "prompts". DALL-E was revealed by OpenAI in a blog post in January 2021, and uses a version of GPT-3[1] modified to generate images. In April 2022, OpenAI announced DALL-E 2, a successor designed to generate more realistic images at higher resolutions that "can combine concepts, attributes, and styles". OpenAI has not released source code for either model. On 20 July 2022, DALL-E 2 entered into a beta phase with invitations sent to 1 million waitlisted individuals; users can generate a certain number of images for free every month and may purchase more. Access had previously been restricted to pre-selected users for a research preview due to concerns about ethics and safety. On 28 September 2022, DALL-E 2 was opened to anyone and the waitlist requirement was removed. In early November 2022, OpenAI released DALL-E 2 as an API, allowing developers to integrate the model into their own applications. Microsoft unveiled their implementation of DALL-E 2 in their Designer app and Image Creator tool included in Bing and Microsoft Edge. CALA and Mixtiles are among other early adopters of the DALL-E 2 API. The API operates on a cost per image basis, with prices varying depending on image resolution. Volume discounts are available to companies working with OpenAI’s enterprise team.
The software's name is a portmanteau of the names of animated robot Pixar character WALL-E and the Spanish surrealist artist Salvador Dalí.[2][1]
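The API mentioned above can be illustrated with a minimal sketch. This assumes the 2022-era OpenAI Images endpoint (`POST /v1/images/generations`) with its `prompt`, `n`, and `size` parameters; the helper function is hypothetical, and a real call would additionally require an API key and an HTTP client.

```python
# Hedged sketch: constructing a request payload for OpenAI's image-generation
# endpoint (POST /v1/images/generations). Parameter names follow the 2022-era
# Images API; the helper itself is illustrative, not OpenAI's code.

def build_generation_request(prompt, n=1, size="1024x1024"):
    """Return the JSON payload for a DALL-E 2 image-generation call."""
    # The API priced images per resolution; these were the supported sizes.
    allowed_sizes = {"256x256", "512x512", "1024x1024"}
    if size not in allowed_sizes:
        raise ValueError(f"size must be one of {sorted(allowed_sizes)}")
    return {"prompt": prompt, "n": n, "size": size}

payload = build_generation_request(
    "an astronaut riding a horse in photorealistic style", n=2, size="512x512"
)
# payload would be sent as the JSON body of the authenticated POST request.
```

Because pricing varies by resolution, validating the `size` field client-side, as above, avoids paying for a rejected request round-trip.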
- ↑ 1.0 1.1 Johnson, Khari (5 January 2021). "OpenAI debuts DALL-E for generating images from text". VentureBeat. Archived from the original on 5 January 2021. Retrieved 5 January 2021.
- ↑ "DALL·E 2". OpenAI. Retrieved 6 July 2022.
2021
- (ML@Berkeley, 2021) ⇒ https://ml.berkeley.edu/blog/posts/dalle2/
- QUOTE: ... The transformer is arguably the meat of DALL-E; it is what allows the model to generate new images that accurately fit with a given text prompt. ...
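The quote above refers to DALL-E's core design: text tokens and image tokens are concatenated into a single stream, and the transformer autoregressively predicts each image token conditioned on everything before it. The sketch below illustrates this sequence layout only; the vocabulary sizes follow figures reported for DALL-E (a 16,384-entry BPE text vocabulary and an 8,192-entry dVAE image codebook), while the toy `predict_next` stand-in is an assumption for illustration, not a real model.

```python
# Illustrative sketch (not OpenAI's code) of DALL-E's autoregressive
# text-to-image formulation: one token stream, text first, image after.

TEXT_VOCAB = 16384   # BPE text vocabulary size reported for DALL-E
IMAGE_VOCAB = 8192   # dVAE image codebook size reported for DALL-E

def make_sequence(text_tokens, image_tokens):
    """Concatenate text and (offset) image tokens into one stream."""
    # Image tokens are shifted past the text vocabulary so both modalities
    # can share a single embedding table without ID collisions.
    return list(text_tokens) + [TEXT_VOCAB + t for t in image_tokens]

def generate(text_tokens, n_image_tokens, predict_next):
    """Autoregressively sample image tokens given a text prompt."""
    seq = list(text_tokens)
    image_tokens = []
    for _ in range(n_image_tokens):
        t = predict_next(seq)           # stand-in for a transformer forward pass
        image_tokens.append(t)
        seq.append(TEXT_VOCAB + t)      # feed the new token back in
    return image_tokens                 # decoded to pixels by the dVAE decoder

# Toy "model" that always predicts codebook entry 0, just to show the loop.
tokens = generate([5, 42, 7], 4, lambda seq: 0)
```

This is what makes the transformer "the meat" of the model: conditioning on the text prefix is what ties the sampled image tokens to the prompt.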
2021
- (Ramesh et al., 2021) ⇒ Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. (2021). “Zero-shot Text-to-image Generation.” In: International Conference on Machine Learning.