DALL-E 2 is amazing, use it responsibly
- Posted on August 26, 2022
- Estimated reading time 7 minutes
The photos used in this blog post are permissible for use under OpenAI’s Sharing & Publication policy.
DALL-E 2 has just moved into a wider release, meaning a lot more people now have access to this incredible image generation tool from OpenAI. Microsoft Azure is OpenAI's exclusive cloud provider, so we expect an increasingly close relationship between OpenAI's services and Microsoft in the future. We wanted to show some of the outputs and discuss how this might change the way we work and prototype in the future. All the images in this blog were generated by DALL-E 2: we typed in a prompt and set the algorithm loose on our description.
DALL-E 2 has an amazing ability to understand human language thanks to its sister model GPT-3. DALL-E 2 then uses this understanding to produce amazing, relevant, high-quality images. DALL-E 2 was trained on a huge corpus of image data, so there isn't much it doesn't know how to generate, in any number of styles, from photorealism to the style of your favourite artist. We'll cover some cool and interesting things we can do with DALL-E in the enterprise, as well as some of the key considerations for responsible use.
An obvious first step is to generate images that we need for content. Using the example of the Avanade orange waves, what happens if we ask for a brand-relevant image?
Orange waves with a technology focus
Trying this prompt gets us immediately, alarmingly close to human-generated stock imagery that we use every day at Avanade, and you could try it on your own branding. The difference is this is completely AI-generated. So, what if we wanted more specific imagery? Imagine we had just written a blog about security and wanted a relevant and on-brand image. Normally, we’d trawl through a reference library of images that had been purchased, or created by humans, but what if we leave this up to AI?
A 3D rendered padlock on a background of orange waves
Looking good. This completely novel image would take hours to recreate in a photo studio, or as a 3D render, and now we can have it in seconds. We can make small tweaks to the prompt to get entirely new images with the same theme, or even choose a close enough image and generate some alternative versions by sending it back to DALL-E 2. We could even upload real images we already own to get variations on real images.
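The generate-then-vary workflow above can be sketched against the OpenAI Images API. This is a minimal illustration, assuming the `openai` Python package (pre-1.0 interface) and an `OPENAI_API_KEY` environment variable; the `brand_prompt` helper and the "orange waves" suffix are our own hypothetical conventions, not part of the API.

```python
import os
from typing import List, Optional

# Hypothetical brand theme reused across prompts (our own convention).
BRAND_SUFFIX = "on a background of orange waves"

def brand_prompt(subject: str, style: Optional[str] = None) -> str:
    """Compose a prompt that keeps every image on the same brand theme."""
    prompt = f"{subject} {BRAND_SUFFIX}"
    if style:
        # e.g. "digital art", as in the prompts shown above
        prompt += f" - {style}"
    return prompt

def generate(prompt: str, n: int = 1) -> List[str]:
    """Request n images for a prompt; returns a list of image URLs."""
    import openai  # requires `pip install openai` and an API key
    openai.api_key = os.environ["OPENAI_API_KEY"]
    response = openai.Image.create(prompt=prompt, n=n, size="1024x1024")
    return [item["url"] for item in response["data"]]

def variations(image_path: str, n: int = 3) -> List[str]:
    """Send an image we already own back to DALL-E 2 for new variations."""
    import openai
    openai.api_key = os.environ["OPENAI_API_KEY"]
    with open(image_path, "rb") as f:
        response = openai.Image.create_variation(image=f, n=n, size="1024x1024")
    return [item["url"] for item in response["data"]]
```

A call like `generate(brand_prompt("A 3D rendered padlock"))` would correspond to the padlock example above, and `variations("padlock.png")` to the variation step; small tweaks to the `subject` argument give new images on the same theme.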
New variations from a source image
We can also generate variations in entirely different styles. Let's try a quantum computing security scenario.
An icon of an atom on a blue padlock
This is great for photos that don't exist, but we can also create temporary placeholders for designs. It's common to use mockup or sample images in website wireframes, for instance. With DALL-E 2, we can generate more relevant placeholders to build on our designs.
A smart phone on a desk with a photo of an orange on the screen
Note that in this example DALL-E 2 picked a great colour for the smartphone case. This doesn't just apply to photography, though: DALL-E 2 can generate in lots of different styles.
A group of people working around a computer in an orange branded office - digital art
In this example, adding the tag 'digital art' results in a funky graphic design that we could use as a placeholder, or as inspiration for a graphic designer to work from. A specific need for this came up when we were building out wireframes for our confidential compute project. We needed a photo of a person holding a security badge up in a specific way to show users what to do at a certain login step. When the product is released, we would like to have this image made for real as accurately as possible, but what about for user testing, when the website is still in its infancy?
A person shows a photo badge
DALL-E 2 can give us a much more relevant image than is available from our stock photos, without having to resort to downloading something from Google, where we may not know the source or licensing, or buying a more specific stock image.
Sometimes, though, DALL-E 2's outputs are more than good enough to use right out of the box, especially when comparing its paintings to what those of us with little to no creative talent could muster.
A painting of the Seattle space needle in a cloudy orange sky - digital art
Using image generation tools responsibly
While we may be excited about the interesting and helpful ways DALL-E 2 and similar engines might change the way we create images, as with any emerging technology, it’s important to think about the various ethical impacts they might have if adoption were to become more widespread in business. In our own internal discussions so far, we’ve uncovered a number of general considerations for image generation engines, knowing that additional issues may arise or become more relevant depending on the use case.
The first consideration is the financial and professional impact on graphic designers. It’s not hard to imagine a cost-conscious executive hoping to trim payroll by employing DALL-E 2 instead of a team of artists. But we could also imagine an executive in the same position licensing DALL-E 2 for a team of artists as a tool for them to push their artistic creativity, strengthen their skills, and generate more engaging imagery. The motivation here will make a big difference when gauging the ethical impact.
It’s also worth considering whether a team’s creativity (and their feeling of license to be creative) would improve or decline in the long run. If we asked four human designers to create an image of “a 3D rendered padlock on a background of orange waves,” would we have a more diverse and interesting selection of images to work with than if we asked the same of an image-generating model? If a majority of creative teams start using DALL-E 2 or something similar on a regular basis, will the industry gradually become less interesting?
There’s also the indirect financial impact on artists and designers whose livelihoods depend on copyrights and royalties. It’s even possible that an individual’s unique design or artwork was one of the hundreds of millions of publicly available images used to train DALL-E 2, although they wouldn’t know it. In that case, an artist might now be competing for a job against an engine they contributed to, however small that contribution. Accessibility matters a lot in this consideration, as some professionals will be much more likely to benefit from these models based on their job, region, spoken language, physical abilities, and access to sufficiently advanced devices.
Another key consideration is the impact of a sudden proliferation of “citizen” designers. Without the common corporate design processes with creative and legal reviews, it may be too easy for any employee to generate images that look close enough to “on-brand” without having the training required to spot brand violations or inappropriate content. The team at OpenAI has worked hard to remove obviously inappropriate imagery, but there are ways to get around these filters, and there are many other ways for content to offend. For example, DALL-E 2’s filters block images of dead animals, but not images of animals lying in a pool of ketchup. (Note: it was an easy decision not to post these images, and our intention is not to suggest that DALL-E 2 images are often grotesque, but rather that the model’s filter alone cannot account for every definition of appropriate versus inappropriate.)
A DALL-E 2 content policy prompt
One last consideration that may be harder to evaluate is the environmental cost of training and running image generation engines like DALL-E 2. The high energy costs and subsequent carbon footprint of such engines are well documented, and while there are efforts underway to reduce energy consumption, the overall impact remains high, especially when accounting for the initial model training processes, not just individual queries. Companies working to fully address their environmental impact should also consider the substantial material costs associated with manufacturing the hardware that supports these engines.
A final note on responsibility
Ethical use of technology should never be a simple matter of risk assessment and mitigation. Most technologies have the potential to positively impact the world, if sufficient care is given to design, development, and use. For image generation engines like DALL-E 2, we can highlight opportunities for ethically positive outcomes, such as making the generation of art more accessible, especially for people with impairments that make visual arts more difficult. We might also consider using these engines for art and language education or even as tools for therapy.
Once again, the ethical impacts of technology depend greatly on intention.