Yet more ramblings about AI generated art

Ever since learning about the prevalence and progress of AI-generated art, mainly via such a piece having won an art competition, it’s not left my mind for too long. I was kind of confused how best to think about it then, and it’s only got worse as time goes on. Herein are some further ramblings.

As a reminder, these are systems where you type in a prompt, e.g. “Picture of a juggling elephant”, and out pops a picture of whatever you asked for. Right now the results aren’t always great, hence the new vocation of “prompt engineer” - someone who knows what phrases to give to an AI for the best results. But they’re often certainly good enough to be used commercially and beyond.

There exist several famous such systems, including DALL-E, Stable Diffusion and Midjourney. Importantly, the way they work is dependent on them having processed vast arrays of existing, presumably human, art. This is typically sucked in from the internet without explicit permission from its creators. No-one really thinks computers just became innately artistic in the traditional sense; but rather that we developed algorithms that allow a computer to translate an arbitrary text string into an image output based on what it’s derived from all the art and contextual information it’s already seen.

This Guardian article gives a quick overview of roughly how these systems work. The below video goes through some slightly more mathy details.

Given the genie is well and truly out of the bottle, I think it’s inevitable that we’ll see more and more usage of AI art over time. The systems are already perfectly usable by almost anyone who can use the web, and will only get cheaper and easier to use over time - for an individual end user even now it’s often free. If nothing else, we can be confident that capitalism’s invisible hand typically pushes us towards the immediately cheapest - often meaning most generic and least human-skill-requiring - solution to any perceived need, no matter the external cost. You’d basically need some kind of world-wide usage ban to stop it, which isn’t going to happen.

Perhaps there’ll be a set of people who are dead against the technology. It may well never replace human artists in their entirety. But it’s going to replace - and perhaps already has replaced - a good number of them. In doing so, it’ll inevitably change what the world looks like for the rest of us.

After all, even if it turns out that the very ‘best’ art is somehow only ever possible via human production, there’s a ton of current use cases where fairly mediocre art is acceptable to businesses and individuals. Particularly when it’s extremely cheap to produce.

Automation replacing people in jobs is very much not new, but it’s traditionally been seen in jobs with perhaps less visible output, considered more rote, and unfortunately probably thought of as being of lower status or worth than the typical romantic perception of the artist. Being an artist is aspirational, if sadly unobtainable, for many folk. That’s not to say there aren’t plenty of unpleasant, tedious, ill-paid and abusive artistic jobs. Maybe it won’t be terrible if some of those disappear, as long as the people involved are cared for. But the trait of being “artistic” is often seen as a very human and desirable one. Those considered to be near the top of their game are widely admired and honoured.

Whilst access to AI art generators is somewhat restricted or at least niche at present, that’s inevitably not going to be the case for long. Microsoft Office, that most unexciting and ubiquitous of workplace software, is gearing up to add a “Microsoft Designer” component that generates illustrations based on the sentences you type. Their promo video shows someone creating a poster for their bakery business based on typing “Cake with berries, bread and pastries for the fall”, no graphic designer needed. I guess this is the modern day version of clipart.

Stock photography as a business might also be on its way out, at least in terms of how it works today. Already there are stock photography sites that offer to generate you an AI-created image if they don’t already have something that meets your satisfaction. That’s one less human photographer or artist receiving credit and payment for their work. In fact one of that site’s dedicated tools reads your blog post text to create you an appropriate cover image. I might try that on this post just for fun when I’m done.

Update: I did just that. Here’s the image stockai.com generated when fed with the text from this post. It was free and took maybe 45 seconds.

That’s an example of how the technology could be additive rather than substitutional. There’s no way I would have commissioned an artist or even browsed stock photography sites for this humble blogpost. No-one lost out (at least not directly; certainly the consequences of participating in shifting norms are debatable). But it also seems very unlikely that many profit-focussed companies who currently invest in some kind of artistic output will keep spending the money they currently do once they get the idea into their heads that they don’t need to.

The Atlantic magazine already got into some controversy over using an AI generated image to illustrate one of its newsletters. The Economist used an AI to design one of its covers. I’m sure there are many more examples, whether we know about them or not. Let’s also remember that an increasing number of the more mundane stories you read in certain publications are written by AIs - the automated journalism trend - so it’s not only visual creativity at play here.

But worrying about people losing jobs due to technology is seemingly not something society tends to do in earnest. The Luddites who literally battled the industrialisation of the textile business in 19th century England didn’t win; they became an insult.

Sentiments along the lines of “don’t worry about it, they’ll just get better jobs right away” or “but this will make things cheaper which will benefit everyone in the end” abound. Sometimes they might be true. Oftentimes probably the former isn’t, at least whilst many of us live in societies that tend to see unemployment as a problem that must be solved by the individual rather than the system that caused it, whilst appearing to have little respect for article 25 of the Universal Declaration of Human Rights. But in theory we could deal humanely and generously with unemployment of all kinds.

What is less obvious to me is what a shift to most “art” we see day-to-day being generated by AI might do to harder-to-define concepts like creativity. The pictures that come out of these AI machines are “new” for sure, in that they never existed before. But they’re all based on some black-box manipulation of existing, human-powered art for now. Some of this is very clear, such as prompts that explicitly request the combination of 2 human artworks that haven’t previously been combined - such as “Kermit the Frog in Blade Runner 2049”.

“A still of Kermit The Frog in Blade Runner 2049 (2017)” #dalle pic.twitter.com/CxyWFRJETc
— HeavensLastAngel (@HvnsLstAngel) May 31, 2022

In other, more abstract cases, it’s much less obvious.

But in any case, the machine needs feeding.

One might consider that human artists are also inspired by what they’ve seen. To what extent is a human’s art also a black box “generative process” based on all the artistic inputs they’ve seen in their life? There’s a way in which this has to be at least partly true within a certain scope, given “movements” exist in art - think Impressionism, Modernism, Neo-Classicism, that kind of thing.

But something starts those movements. What I don’t know is whether that’s something qualitatively different to the kind of generative process that the crop of AI art generators we might expect in the next few years could do. Or am I putting humanity on too much of a pedestal? Do we know that human creativity isn’t deep down based on a not dissimilar process? Disclaimer: I haven’t actually done any research into creativity really. There may exist very simple answers to these questions.

But if there is an insurmountable difference, or simply if market forces et al. result in AIs being attuned towards a diminutive and limiting goal like “optimise your output to please the greatest number of people today” then perhaps something will be lost, or at least delayed. Will future historians note a moment where the world got artistically stuck in some sense?

It’s of course possible I’ve gotten this the wrong way round. An AI playing the game of Go beat its human opponent with moves so novel and unexpected that they’ve been described as ‘alien’ and ‘from an alternate dimension’. Maybe one day AI art generators will show us something valuable we’d never otherwise have imagined.

Perhaps some of the more concerning “getting stuck” effect has already happened to some extent via a different algorithmic use case. Witness for instance how Netflix has changed its once acclaimed artistic output based on what its algorithms say will sell well. Look at the truly bizarre output that sometimes populates social media at present, where potentially artistic output gets optimised for engagement above anything. It might result in fascinating, important, meaningful new art forms. It can also result in a spate of videos of people eating out of toilets because at some point someone did it and the resulting video went viral.

It’s not clear how “new” this is; I’m sure art has always been driven by incentive. It may just be that accessibility and access to an audience have improved. It’s much easier for me to share a video of myself eating ice-cream out of a toilet on Facebook than it would be for me to paint some kind of Mona Lisa-beating painting and set up an internationally renowned art gallery to show it off.

It’s often easy to forget that just because something is difficult doesn’t mean that it’s “better”. That sentiment is often a metaphorical barrier used to prohibit less privileged sections of society making inroads into whatever sphere the existing powers-that-be want to protect for their own selfish purposes.

A little less dramatically: should my blog go without illustration because I don’t have either the talent or resources needed to go to art school? But also, if I’m not prepared to pay or credit an artist then do I really have the right to use an illustration that was overtly non-consensually derived from work of many other people? As these systems work by ingesting incredible amounts of other people’s art, a whole set of rights issues exist.

Being popular with the current AI art community is not necessarily a blessing for a human artist. Greg Rutkowski is an artist known for fantasy landscapes. But his style is liked enough that his name has apparently become a common prompt for people using Stable Diffusion or Midjourney to generate art, with tens of thousands of images being generated by typing in phrases including his name.

An example shown by the MIT Technology Review used the prompt “Wizard with sword and a glowing orb of magic fire fights a fierce dragon Greg Rutkowski” which produced this:

Now when Rutkowski Googles his name he gets art that isn’t his. And he has no rights over the AI-generated images. So if some entity wants to use some work that an AI explicitly created based on having ingested hundreds of his images without permission then they’re presumably free to do so. This has understandably left him with the opinion that “A.I. should exclude living artists from its database”. Which prompts me to additionally wonder whether there are any ethical considerations of note surrounding the possible AI exploitation of a dead artist’s work.

For now though, if the artist’s images are on the internet then they’re considered fair game. These systems don’t ask artists to opt in to having their work included, and it’s not clear that they even have the ability to opt out. Some tools that might at least let you know if your work has been used to train these AI systems have started to appear, including Spawning’s Have I Been Trained?.

To what extent this is different from a human artist enjoying Rutkowski’s pictures and later creating something in a similar style is something that perhaps still needs to be thrashed out. After all, the ingested pictures were on the publicly accessible internet in the first place. But the big difference here is of course accessibility and scale. The new part is that a single person or business with no art knowledge or skill, and little-to-no resources, can generate hundreds of “original” Rutkowski style pictures a day.

This opens up new possibilities for the masses, which could be a positive thing - see the optimistic possibility Wired presents of “dramatically expanding the number of people able to generate and experiment with art and illustration”. But left without consideration, it may also result in a pretty devastating exploitation of people’s hitherto highly valued work. And potentially a world where it feels like we’re immersed in some kind of Borges-style Library of Babel, but for whichever genre of art is algorithmically popular or convenient, with a possible plot twist of any remaining human artists being accused of being AIs.