Augmented creativity

21 February 2021 Filed under: creativity

Truly creative AI is still some way away. Meanwhile, though, collaboration between humans and AIs holds exciting promise.

There’s an idea that’s been doing the rounds for a while that, soon enough, AI will become creative. Whole works of art – songs, paintings, poems – will be brought into being entirely by computers, throwing human culture into disarray.

We’re not there yet. We might never be. But we are at an interesting juncture, and what’s right around the corner has the potential to be just as interesting as the endgame.

For a vision of what’s to come, we might look at a world that’s already been completely upended by AI: chess. Today’s computer chess engines are almost comically powerful. It’s been 24 years since the early milestone of Deep Blue beating Garry Kasparov in 1997, while running on a purpose-built supercomputer that cost over $100m to develop. Now we’ve reached the stage where publicly available chess engines, costing $50 and running on a standard laptop, can beat any human player. The dominance of chess AIs has sparked a cheating crisis and worries that the creativity, mystery and beauty of the game are being destroyed.

Before this dominance, though, there was a liminal period, where chess AIs were strong but fallible; where they could mostly beat humans but had strange quirks or blind spots that meant they wouldn’t necessarily crush them every time. At that point, it turned out that a human and an AI working in tandem were greater than the sum of their parts. The AI might offer up moves that could then be curated by the human player. The human player might spot errors or simply contribute a different perspective to the computer; likewise, the computer might suggest moves that the player would never independently have thought of. The human player’s creativity was being augmented by the AI, and the result was a player that could beat other computers and other humans alike.

This is a form of zeroing out. Rather than the human player relying on their own internal model of chess, which is likely to be hugely flawed, they start with an accessible, state-of-the-art model in the form of the AI. Then, they see whether they know something that the AI doesn’t and, if they do, they make a tweak to its choices. State of the art plus a tweak stands a far better chance of being right than you do alone, and of being the best outcome overall. Human creativity augmented is stronger than human creativity by itself.

We’re not quite at the stage where this is possible with most creative fields, outside of games with rigid and definable rules. But we’re tantalisingly close, and nowhere are we closer than with written language.

That’s mostly because of the great leaps forward made in the last couple of years by GPT-3, the third generation of OpenAI’s “Generative Pre-trained Transformer”. GPT-3 was trained by ingesting an almost incomprehensibly large quantity of text – effectively the entire internet, all of Wikipedia, hundreds of thousands of books, and more besides. You interact with it via text prompts, almost as though you’re having a conversation; you write to it in text, and it responds to you in text. You can provide it with the set-up line to a joke, and it’ll complete it; or you can write a paragraph of an essay, and ask it to finish it; or you can give it the same sort of open-ended prompt you’d give a child (“write a story about a cat”). The difference GPT-3 makes, compared to past attempts to solve this kind of problem, is that it doesn’t need to be specifically trained to solve particular problems; it’s general-purpose, and can be asked to write virtually anything you can think of.

The results are fascinating, remarkable, and often spooky. The writer Gwern Branwen has conducted many in-depth experiments to explore what GPT-3 is capable of. Some of my favourites are his attempts to prompt literary parodies – to get GPT-3 to write things in the style of a famous author, which it can do because it has read everything by every famous author.

For example, given a prompt to write Harry Potter in the style of Ernest Hemingway, GPT-3 comes back with:

Voldemort wanted something. But what?

He had a purpose. He wanted to destroy all of creation. He wanted to end it all. He could have that. He would have that. He didn’t know yet that he could have it. Voldemort had created Harry. Voldemort had never really destroyed Harry. Harry would always be there, a vampire, a parasite, a monster in the kitchen, a drain on the household, a waste on the planet. Harry would never be real. That was what Voldemort wanted. That was what Voldemort wanted to feel. He would have that. He would have everything.

He had a purpose. He was working on something. He had no idea what it was. He was a perfect tool. He was born to be used. He was no one at all. Harry didn’t exist. The little boy with the lightning bolt scar didn’t exist. The boy who had lived did not exist. He never had lived. He never would have lived. He was dead. He had been dead. Voldemort wanted Harry dead. Voldemort wanted Harry dead so he could be born again. Voldemort was God.

Harry was God.

Voldemort was God.

And Harry, the god of Death, was not going to allow that to happen.

Not yet.

Not ever.

The lightning bolt scar burned.

The boy ran for his life. He knew that he would die. He was alone. His best friends were dead. He knew that he would die. He knew that he was a waste. He was nothing at all.

He had no past. He had no future. He was dead.

He hated his parents. He hated everyone. He hated the Ministry of Magic, the newspapers, the newspapers, the Daily Prophet, and the Daily Prophet, and the headlines, the headlines, the headlines of the Daily Prophet, “THE LIGHTNING STRIKE SCAR CONFOUNDED.” “THE BANE OF AZKABAN, THE BOY WHO LIVED.” “THE SON OF THE ANCIENT DARK LORD.” “THE STOLEN PRINCE.” “THE JOKER.” “THE DARK SON.” “THE BRIAN BOYLE.” “THE BOY-WHO-MESSED-IT-UP.” “THE LITTLE MAN WITH THE GREAT BIG SCAR.”

It’s remarkable. It hangs together; it flows; anyone with a passing knowledge of Harry Potter will recognise the consistency of the characters and the world, and anyone who’s read Hemingway will recognise the style. Okay, there are some weird bits too. (Who’s Brian Boyle? The hockey player? Or the triathlete who survived a car crash when he was younger? Has the AI picked up on the mirror between Harry Potter’s survival and Brian Boyle’s?!) The whole thing sits uncomfortably in the uncanny valley – almost there, but not quite – in an unsettling way.

What’s interesting is that we’re already well past the point of it being remarkable that a computer can string a sentence together, or even create an internally consistent passage of text. Here we have a computer able to distinguish between content and style, seemingly able to understand the fundamental elements both of a fictional world and of an author’s stylistic tics. And it hasn’t been trained to do that: it’s just absorbed it naturally, as part of reading basically everything that’s ever been written. It’s terrifying and wonderful.

As Branwen says:

“It’s amazing to think that GPT-3, which is essentially nothing but an old obsolete 2018 neural net scaled up and trained on random Internet pages, somehow just magically learns all of this abstraction and is able to casually merge Harry Potter with scores of authors’ styles given nothing but a slight nudge – no Gram matrix, no reinforcement learning, no fine-tuning, no nothing, none of this designed into GPT-3 in the slightest way or these capabilities known to the original OpenAI researchers. What else can GPT-3 be prompt-programmed to do…?”

Matt Webb has described using GPT-3 as a way to “dowse the collective unconscious”, revealing ideas that are latent in what humans have collectively written but that haven’t necessarily been explicitly expressed:

“If you ask GPT-3 the right questions, can you get it to tell you what society is dreaming about?”

Some of the creepiness of GPT-3, I suspect, comes from this: it generates ideas that feel as though they’ve been lurking in our unconscious, novel but plausible, just waiting to be uncovered. And just as great creative work often hinges on writing the right brief, getting the best from an AI requires writing the right prompt – and then curating and directing the response effectively.

That begins to suggest a way of working with AIs to augment creativity:

1. A human hits a creative block, or wants to solve a particular problem.
2. They prompt the AI in some way – either offering up their initial solution, or simply posing the problem to it.
3. The AI offers responses, potentially unlocking a new direction that the human hadn’t yet considered or might never have got to themselves.
4. Acting as a director, the human influences the path that’s taken next by feeding the AI more prompts, by using the AI’s work as an inspiration for work of their own, or by taking away the AI’s work to refine it themselves.
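
For the programmers among you, the steps above can be sketched as a simple loop. This is only a rough illustration under stated assumptions, not how GPT-3 or any real tool works: `co_create`, `Decision`, `toy_generate` and `toy_director` are all hypothetical names invented here, the "AI" is a stand-in function that merely produces labelled variations of the prompt, and the "human" is a stand-in rule that accepts the first response mentioning Hemingway.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Decision:
    """The human director's verdict on one round of AI responses."""
    accepted: Optional[str] = None      # a response the human is happy with
    next_prompt: Optional[str] = None   # or a new prompt to steer the AI


def co_create(
    prompt: str,
    generate: Callable[[str], List[str]],
    direct: Callable[[str, List[str]], Decision],
    max_rounds: int = 3,
) -> Optional[str]:
    """The AI proposes, the human curates or redirects, and so on."""
    for _ in range(max_rounds):
        candidates = generate(prompt)            # the AI offers responses
        decision = direct(prompt, candidates)    # the human acts as director
        if decision.accepted is not None:
            return decision.accepted
        if decision.next_prompt is None:
            return None                          # the human gave up
        prompt = decision.next_prompt            # feed the AI a new prompt
    return None


def toy_generate(prompt: str) -> List[str]:
    """Stand-in for the AI: returns three labelled variations (an assumption)."""
    return [f"{prompt} (variation {i})" for i in range(1, 4)]


def toy_director(prompt: str, candidates: List[str]) -> Decision:
    """Stand-in for the human: accept anything Hemingway-flavoured, else nudge."""
    for candidate in candidates:
        if "Hemingway" in candidate:
            return Decision(accepted=candidate)
    return Decision(next_prompt=prompt + ", in the style of Hemingway")


result = co_create("a story about a cat", toy_generate, toy_director)
print(result)
```

The interesting part is how little the loop itself does. Everything that matters lives in the two functions passed in: the quality of what the AI generates, and the judgement of the human deciding whether to accept, redirect or walk away.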

The AI is a creative foil that never grows tired, or bored, or frustrated, or impatient. It can work ceaselessly on the same brief, offering up limitless variations and new directions. And it’s incredibly well-read, and so is able to throw up inspiration from the most esoteric sources.

It’s not just text, either. OpenAI, the creators of GPT-3, have already developed Image GPT, which can complete images from prompts to a startling degree. They’ve also developed OpenAI Jukebox, which will co-compose songs given a sample prompt. Jukebox is perhaps the eeriest of the three projects; listening to Jukebox-continued songs can be like driving a car through time, spinning the radio dial as you go, hearing ghostly voices from the past wash in and out amongst the static, their lyrics unintelligible, while a weirdly consistent rhythm section continues to play. I’m sure they won’t be troubling the charts any time soon, but they’re certainly an experience to listen to.

At the moment, access to these tools is open only to a small set of AI researchers, and available only over a slightly intimidating API. But it won’t be long before they’re baked into all sorts of products. Imagine creating a PowerPoint presentation by writing a few slides as prompts and telling the AI roughly what you want to say with it. Imagine a text editor that could offer up different endings to your writing, or re-style it. Or an animation tool that took a static image and animated it in styles of your choice. The possibilities are endless.

For the final lesson, perhaps it makes sense to return once more to the world of chess. In 2005, the online chess site Playchess.com hosted a “freestyle” tournament in which players were welcome to use computer aids however they saw fit. It attracted grandmasters with international reputations, but it was ultimately won by two outsiders from New Hampshire, Steven Cramton and Zackary Stephen, ranked far below their fellow finalists. Stephen summarised the reason for their success:

“We had really good methodology for when to use the computer and when to use our human judgement, that elevated our advantage.”

Perhaps that approach will be the key to future creativity: having the awareness to understand when the situation calls for the computer, and when it needs something that only humans can – for now – provide.

Roblog