AI is an Idiot Savant

... and this is great

Apr 10, 2022

I have a confession. I've always wanted to play music. But lacking the dexterity of limbs or the hours of practice, I can’t. I can’t tell you how frustrated I was at the fact that most musical instruments seemed barbaric - you physically blow air into a wooden tube? Or pluck strings. Or literally bang my hands on things. Which meant in college I came up with an idea for a better musical instrument. This would be a square, with four quadrants. You would play it by sliding your fingers on each quadrant to control pitch and rhythm. It would look less cool shredding on stage but might be easier to play. After a while of messing with it I eventually gave up, because touchpads weren’t that great yet and I had other things to obsess about.

Now, while this tells you more than it should about my college days, I remember this when I read about modern AI progress. I’d wanted to create music. I’d have liked to paint too. But lacking an extra half a decade, I’d never be able to create something outside the way I see it inside. This is because it takes a huge amount of skill and learning before you are able to teach your body to create things the way your mind conceptualises them.

And this is mainly why I love AI art. I don’t know if any of you have noticed, but the image for Strange Loop Canon originally used to be this picture of St Paul’s Cathedral in London, which I wanted to have painted a particular way, and the only way I could get there was by training a style transfer algo and a huge amount of free compute from Google Colab.

I’m not saying it’s good, by the way. I am saying the only way I could have made this was with the help of a friendly AI.

The process of identifying a vision and speaking just so to transform it from a feeling in one's mind to an actual painting is incredible. That shouldn't be made inaccessible because some of us aren't blessed with the vision to create it de novo. Art, just like every other human endeavour, has to find its own leverage points to make sure artists can create art and not just hone their skills.

But still, there’s something supremely uncomfortable about something that seemed so close to the human universal (creating art) becoming mechanised. It is as if we have managed to bottle up something crucial about being human and thus made it anodyne, commonplace, less worthy.

For what it's worth, if what I do were supplanted by a commodified algorithm, I'd find that discomfiting too. To press a button and get an average Strange Loop essay would make me sad. Maybe it would push me down the hill of giving it up, or maybe I'd strive to do better than the algorithm, Sisyphean though the task might be.

But this too is progress. Because progress is inanimate and impersonal. It's not for me but the next billion artists to make their inner visions outer reality without first figuring out how to use a brush.

In commodifying a skill we will have also removed a piece of embodied learning from ourselves. That's the price we pay. The half a decade of skill training I mentioned above is not simply skill training. That’s also the time when all the other skills get honed. We might start doing do-re-mi in music classes, but doing the scales also helps embed the possibilities within us.

That's the discarded effluent which fuels the ability to leverage and do so much more. What will be the equivalent in painting? I don't know but I'm damn sure there will be one.

The whole point of art was that it's a universal language. A way to express the ineffable directly from one cortex to many. But that catholic nature has now been turned personal. For each of us to try and find meaning that speaks to ourselves. And there's beauty in that too!

Part of the anguish at least seems to be because we're used to seeing art as a unique expression of various ineffable human longings. Variation, then selection, however, seems to be at least partially in effect as the magic behind our artistic genius. Art that we love today, even the most impressive pieces, can be considered the best examples of a particular scene. For example, this is a painting I adore. I could stare at it for hours, and have.

This isn’t a unique point of view. The subjects - Madonna, the angel, Jesus and John the Baptist - have been depicted in similar situations and tableaus multiple other times. Including by the most famous and forgotten artists of the day.

The lesson is that the existence of a multitude of copies of a thing isn’t enough to stop one of them from speaking to our very souls. And because souls are heterogeneous, whether you believe in them or not, the one that speaks to me might not be the one that speaks to you!

Like all commodification, as we’ve learnt from the past, this will lead to a bimodal distribution of folks working within the space. It’ll give rise to an extraordinary number of hobbyists (I’ve always wanted to make a graphic novel, and now I can) and a few extraordinarily talented artists (highly paid and respected for their extraordinary real and/or perceived skill). Who this hurts is the vast majority in the middle!

So once we have replaced artists with DALL-E and midjourney and their descendants, what exactly do we do now? Are we just doomed to live lamenting the loss of an indisputably human domain? What’s next? Three options.

Someone needs to create the next style

While the tools we have today can paint cats playing piano in the style of Picasso, you still need Picasso to create that style.

(The interesting bit for me is that they're so similar to our styles yet arrived at them so differently. It's like finding carcinization to be true but in a gas giant. Convergent evolution is trippy.)

Will this change? Probably. But we are impatient beings with no real conception of what we want. Art is about finding out. This too means that skill in art may not be about pixels and compositions but finding the right admixture of them to tug at each other’s heartstrings.

This isn’t meant to be a bulwark that stands forever. Just as we can do sophisticated style transfers creating tableaus of teddy bears painted as if by Caravaggio, we might also soon start being able to create representations such as combining elements of Klimt with Picasso.

But combination of elements is only one of the inputs. This wouldn’t necessarily get us to discover pointillism, or cubism, or even the manic genius of Jeff Koons. Until then, we have a clear role to play. Humanity can evolve beyond reproductions within styles or recombinations of existing portrayals and instead focus on making net new creations. This will be a higher risk endeavour, but that is the right way for it to be.

Direction of art will be the emerging field

Creation of a piece of art comprises multiple functions the artist has to play. She has to decide the subject of the painting, the components within the painting, the exact composition and arrangement of those components, the styles to use to craft this composition, and most importantly a vision of what it needs to evoke from those that see it.

The emergence of the tools helps make some of these much easier than they used to be. Trial and error, which heretofore could only be done by some select large art schools or folks like Koons with huge studios and students, can now be done with admirable ease. The selection of specific components and subjects and even the styles can be done as if via a dropdown menu.

But the most important bits of the vision and the framing and composition are still to be done. For instance, this is “human and AI fall in love and create the future children of the milky way galaxy”

This is great! But note that we can actively affect the choices of every single thing here - the colours, the characters themselves, the placement of the galaxy, even the action that depicts love and children. The creation is great, the direction is the important bit.

Or here - these are both cityscapes on Mars, and about as different from each other as could be! And a thousand other variations remain to be created.

Left is DALL-E 2 and right is midjourney

There are also plenty of blind spots still. Ironically for a computer, the attention to detail is severely lacking. The paintings all feel unfinished, because in a very real sense they are. This might get solved over time, but it also seems ripe for someone actually versed in this to take it the last few percent.

Someone needs to create the training materials

Since art is about creating something that sings to our soul, there will always be a part of us which tugs towards the art with a story, since knowledge of that story is part of the art itself.

But while DALL-E 2 and midjourney can create wonderful evocative art pieces, they are also of a moment. They are able to do this because today they can use the giant corpus that is human art creation until now as their training ground.

But this is a static picture. Will self-developed learning of the sort AlphaGo Zero did be enough to create a new wealth of styles? It’s unclear as yet.

Andy Zeng (@andyzengtweets)

With multiple foundation models “talking to each other”, we can combine commonsense across domains, to do multimodal tasks like zero-shot video Q&A or image captioning, no finetuning needed. Socratic Models: website + code: socraticmodels.github.io paper: arxiv.org/abs/2204.00598

3:27 PM · Apr 7, 2022


For the immediate future at least we’d seem to need human ingenuity to create novelty that’s sufficiently different from the reconceptualisations of what’s existed so far, but also not so alien that we can’t even understand it.

“Is this bad” seems like the wrong framing of the question. Anything that improves our ability to manifest what we want is a good thing. The disruption it will create to those who made their livelihoods from this is a bad thing. But they're not equal nor are they equivalent.

The increase in our ability to surf the memetic expanse of art is a good thing. Expansion of the possibility space and enabling us to see further into that frontier is a great gift. As is the democratisation of base artistic ability, by the way. The joy I feel in conceptualising something and making it come about is no less than if I'd sketched it myself. If xkcd can show the hilarity and complexity and absurdity of the whole world with stick figures then surely the talent isn't just in creating the visual but almost everything else.

As to whether this is the first knell sounding out our inevitable demise, nobody really knows. Fatalism however has rarely seemed a helpful strategy in these cases even if that were true. I’d prefer to think we can find new areas for us to excel rather than bemoaning the loss of an existing one, much as we bemoan the loss of the arts of letters and postcards, handwriting, calligraphy, even printing.

Dealing with the broken dreams of our skilled artisans is also the story of our collective progress. Let's not let the tears for what we lost cloud the visions of what we can now do.

Sharon John

Idiot savant is the best term for this. I always wonder why we feel this constant anxiety about new technologies. Is it because we attach our meaning and self worth to our abilities since that’s how we’re used to relating to other humans? If that’s the case, we need to move to an older idea of human essentialism.

Seems like as you noted on twitter, the trend is for more areas to ‘automate the execution’ so people with vision and interest keep seeing their powers amplified.

Michael Feng

I'm excited to try out DALL-E but I'm not overly worried about commodification of art or other negative externalities, because I view the value of art as the process of creation, the journey, rather than the final output. Using DALL-E is a different process, but if what I love is the physical mastery of a skill or the evolution of an idea into a composed article, then I just wouldn't use it.

