At OpenAI, one of the world's most ambitious artificial intelligence labs, researchers are building technology that lets you create digital images simply by describing what you want to see. This blend of language and images could serve graphic artists, but it could also alter our perceptions about art, ethics, and the future.
Photo by Liam Charmer
AI that Draws Anything You Ask It: DALL-E
In a nod to both WALL-E, the 2008 animated movie about an autonomous robot, and Salvador Dalí, the surrealist painter, OpenAI named its image-generating AI DALL-E. It takes written requests from users and, drawing on what it has learned from hundreds of millions of images, creates its own. Type in "cats playing chess" or "a bear playing violin on stage" and DALL-E will make it for you, in almost any style you want. DALL-E can also edit photos. Say you want to swap the bear's violin for a trumpet: all you need to do is ask, and a trumpet will appear between the bear's furry arms.
The results are shockingly good, which is why, since its limited release, DALL-E has inspired both awe and concern from people who have seen what it can do. On one hand, DALL-E and systems like it look like engines of a creative revolution; on the other, their potential to replace graphic designers understandably does not sit well with everyone. There is a line of thought that eases at least some of that worry, though, and it has to do with how we view the AI in relation to the artist.
Photo by Trollinho
AI Perception: Don’t Anthropomorphize It
The whole idea of AI taking over people's jobs sounds like something out of an especially dystopian Black Mirror episode. That's why DALL-E's creators avoid anthropomorphizing the AI; presenting it as an autonomous entity simply freaks too many people out. Don't believe us? Just look at the recent news about Blake Lemoine, a Google engineer who claimed the company's AI was sentient. He was placed on paid leave after publicly claiming that Google's Language Model for Dialogue Applications (LaMDA) had consciousness and a soul. Lemoine also said that LaMDA had to grant engineers permission before being experimented on. Most independent experts, as well as Google itself, were quick to dismiss the idea, pointing out that if AI seems human, it is only because of the massive amounts of human-generated data it has been fed.
In fact, the kind of AI behind DALL-E is fundamentally an imitation of us; it is designed to be. Powered by a neural network (an algorithm loosely modeled on the inner workings of a human brain), DALL-E 'learns' what objects are and how they relate to each other by analyzing images alongside the human-written captions that describe them. DALL-E's product manager, Joanne Jang, compares the process to showing a child flash cards: if the model sees many pictures of avocados captioned "avocado," it learns what an avocado looks like. And if you type "a dancing avocado wearing sunglasses," DALL-E will draw on what it knows about avocados, sunglasses, and the concept of dancing, combining everything into a plausible interpretation.
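The flash-card analogy can be made concrete with a toy sketch. To be clear, this is not OpenAI's actual training method (DALL-E uses deep neural networks over raw pixels and text); the dataset, captions, and "visual features" below are invented purely to illustrate how repeated caption–image pairings let a system associate a word with what it tends to look like:

```python
from collections import Counter, defaultdict

# Toy "flash card" dataset: each entry pairs a caption with a list of
# hand-labeled visual features standing in for an image's content.
dataset = [
    ("a ripe avocado", ["green", "oval", "pit"]),
    ("avocado on toast", ["green", "oval", "bread"]),
    ("sunglasses on a table", ["dark", "lenses", "frame"]),
]

# Count how often each caption word co-occurs with each visual feature.
associations = defaultdict(Counter)
for caption, features in dataset:
    for word in caption.split():
        associations[word].update(features)

# After "training", the word "avocado" is most strongly tied to the
# features shared by both avocado flash cards.
print(associations["avocado"].most_common(2))  # → [('green', 2), ('oval', 2)]
```

A real model learns continuous representations rather than co-occurrence counts, but the principle is the same: seeing "avocado" paired with green, oval things over and over is what lets the system later draw one on request.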
With all the breakthroughs DALL-E is making, it is only natural that others follow suit. Google recently released a similar AI called Imagen, and a comparable generator called Midjourney, which The Economist used for a cover image, entered beta around the same time as DALL-E 2. There's even a lightweight version, DALL-E Mini (since renamed Craiyon), made by the open-source community for public use.
Photo by Dan Cristian Pădureț
All’s Not Perfect with DALL-E
All that said, the technology is not without its flaws. DALL-E is still in what OpenAI calls a "preview" phase, released to just a thousand new users a week as engineers continue making tweaks. Ask the current DALL-E for something it has not seen before and it will produce an approximate guess, often far from what was requested. And despite the generally high quality of its renderings, some areas still come out blurry or a bit abstract. Most surprisingly, it has a hard time figuring out how many fingers humans are supposed to have, treating the count as being as arbitrary as the number of leaves on a tree.
DALL-E is also imperfect by design. The AI is deliberately bad at generating photorealistic faces, rendering wonky eyes or twisted lips on purpose to guard against the technology being used to make deepfakes.
Despite its limitations, DALL-E's small but growing number of users are posting to social media at a fever pitch. More and more people are playing around with it and its imitators, sharing the results on their platforms and pages. And though it is free for the time being, OpenAI eventually plans to monetize all this interest by charging users for access to its interface, positioning it as an artist's tool rather than a replacement. While the fear of replacement is not entirely unfounded, the technology could also serve as a creative copilot, helping to create entirely new art forms.