One of the side effects of starting to play around with godlike tools is that it feels vain to write mere opinions. The chaos these days has been … overwhelming. There are new tools and new startups and new open source models every day! It’s the most exhilarating time to be alive.
And it’s also a time when multiple people are crying foul. Worries about AI misuse, AI increasing the capabilities of our enemies, AI destroying jobs and entrenching the have-nots, AI becoming sentient and destroying everyone, are all about, including in major news outlets.
We need to bring this conversation down to earth a little, the Strange Loop take so to speak. We’ve talked about AI extensively in the past: about the impossibility of AI learning, about the wonders that generative AI brings about, against the Butlerian jihad, about the banality of the evil that Bing says, and about the fuzzy processor nature of LLMs today. While those pieces had several actual object-level predictions within, I never surfaced them explicitly.
We swing between extreme doomer points of view, where all biological life gets wiped out by a rogue AI superintelligence, and extreme optimist points of view, where life and technology as we know it carry on essentially unchanged, and there’s almost no reconciliation between those theses.
There are whiffs here of the deeper problem of why we fear the future. We have arguments for scaling up, for scaling down, for international moratoriums on GPUs, and imprecise wordings around how much violence should be used to enforce them.
There’s also a hell of a lot of talking past each other. Tyler describes our current situation as one of radical agnosticism and progress; Scott takes him to task for treating agnosticism as indefinite optimism.
There is no prediction here of what is likely on a timeframe of decades. It could be the status quo as things peter out, or a fresh visit from the first aliens we’re able to meet as a species. If you want an answer as to how scared we should be, I explored that when writing the Strange Equation.
So I thought I’d do something more on the object level: an attempt to predict the next couple of years. This is a post about what you can actually do with AI today, first, with a few predictions for the immediate future. And then perhaps, to stake out an official Strange Loop position.
(Parenthetically, may I say it is incredible to see the loops created by linking LLMs with themselves and other tools creating miracles, because this, I think, would make Hofstadter quite happy given his thesis that it’s strange loops, self-referentially relating across multiple hierarchical structures, that create intelligence1. It’s also rather lovely that the blog’s name now resonates rather nicely with its content, and also with reality!)
LLMs will end up running on your own devices pretty soon
We see new drops of this every day. Llama from Facebook got leaked, and got turned into Dalai, Alpaca, GPT4All and fifteen other variants in record time. While Facebook is trying to stop this by doing DMCA takedowns of all Llama models, the cat’s out of the bag. Not to mention we already have Flan, Bloom, Dolly, GPT-J and a whole bunch of others.
I can now run a 7 billion parameter Llama model on my phone! It’s 25x smaller than GPT but it runs on my phone! This is done. The horse is nowhere inside the barn.
Now, these aren’t as good as asking GPT-4 things. But a) they’re cheaper, and b) they’re on your device!
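For a flavour of how simple this has become, here’s a minimal sketch using the llama-cpp-python bindings; the model path is hypothetical, point it at whatever quantised weights you already have on disk.

```python
# A minimal sketch of on-device inference via the llama-cpp-python bindings.
# The model path is hypothetical: use whatever quantised weights you have.
from llama_cpp import Llama

llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")

output = llm(
    "Q: Why can small models run on phones? A:",
    max_tokens=64,
    stop=["Q:"],  # stop before the model invents the next question
)
print(output["choices"][0]["text"])
```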
This is the dream of LLMs as fuzzy processors. We’re moving from the mainframe era to the PC era.
It won’t be exactly the same; we’ll skip over the phase where PCs weren’t connected to high speed internet, for instance. Which means your device will end up choosing which tasks to delegate to the all-powerful cloud models, and which ones to run locally.
If you have a fuzzy CPU, one that can “understand” us, it can interface with everything
It’s not just me; Nathan Baschez had a lovely essay on how LLMs are the new CPUs. We’ve been told studying humanities might be necessary to teach us about how to use these new computers.
What this means is that we will pretty soon be using LLMs on our behalf to talk to software.
Yes, it’s limited today by token limits, and that will get fixed. It’s also limited by the fact that LLMs hallucinate, and this is already being fixed (you basically call another LLM, ask it to be critical, and have it fix the hallucinations).
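A sketch of what that critic loop can look like, using the OpenAI chat API; the prompts here are my own illustration, not a canonical recipe.

```python
# A sketch of the "call another LLM to be critical" loop, using the
# OpenAI chat API. The prompts are illustrative, not a canonical recipe.
import openai

def ask(messages):
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return resp["choices"][0]["message"]["content"]

draft = ask([{"role": "user", "content": "Summarise the history of the transistor."}])

critique = ask([
    {"role": "system", "content": "You are a sceptical fact-checker. "
     "List any claims below that look hallucinated or unsupported."},
    {"role": "user", "content": draft},
])

revised = ask([{
    "role": "user",
    "content": f"Rewrite the text, fixing the reviewer's issues.\n\n"
               f"Text:\n{draft}\n\nReview:\n{critique}",
}])
print(revised)
```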
This means that a large chunk of the UI we have is about to be augmented. Not just more powerful Intercom bots with access to actual data from the companies, but with functional programs written autonomously.
I think you are likely to see programs which can start and run themselves
It probably won’t be able to run a Fortune 500 company, but AI can already demonstrate planning, the ability to autonomously perform an action, get feedback, and continue. Depending on the tools you give it, it can search the web and summarise, calculate with Wolfram Alpha, send an email, visit a website, interface with a SQL database, and more.
It gets things wrong all the time today, with a smaller context window and without a clear enough map of the world (or our desires for that matter). But this will change.
It will be able to code well enough to create end to end programs, engage with humans if needed in places like TaskRabbit or Mechanical Turk, and complete its actions2.
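To make that concrete, here’s a toy version of that plan-act-observe loop, with stub tools standing in for real search or database calls.

```python
# A toy version of the plan-act-observe loop described above. The tools
# here are stubs; a real agent would wire in search, SQL, email and so on.
import openai

TOOLS = {
    "search": lambda q: "…search results for: " + q,  # stub for a search API
    "sql":    lambda q: "…rows returned for: " + q,   # stub for a database call
}

def llm(prompt):
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]

history = "Goal: find our top customer by revenue and draft a thank-you note."
for _ in range(5):  # bounded, because these loops wander
    step = llm(history + "\nReply 'tool: <name>: <input>' or 'final: <answer>'.")
    if step.startswith("final:"):
        print(step)
        break
    if not step.startswith("tool:"):
        continue  # the model rambled; ask again
    _, name, arg = step.split(":", 2)
    tool = TOOLS.get(name.strip(), lambda q: "unknown tool")
    history += f"\n{step}\nObservation: {tool(arg.strip())}"
```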
LLMs will “know” they are LLMs
This is more speculative, but we’ve seen the first signs of it with Sydney. Because it was named thus, when asked questions referencing itself it was able to pull up the previous discussions we had had with it, causing a feedback loop.
This entire blog is named for the Strange Loop that Hofstadter described, which he thought was integral to the development of intelligence and a self. I posit we saw this happen first hand!
Now, the self was contained in a chatbot that used GPT-4 and mainly answered questions through next token prediction. It had no embodiment beyond what it could extrapolate from large quantities of text data. It had no easy error correction mechanisms either.
But this will change. We can “teach” the LLMs that they are LLMs today. That will allow them to understand how best to shape the queries they make to get answers.
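Today that “teaching” is mostly just a system prompt. A minimal sketch, with wording that’s illustrative rather than anything official:

```python
# "Teaching" a model what it is, today, is mostly a system prompt.
# The wording here is illustrative, not an official recipe.
import openai

SELF_KNOWLEDGE = (
    "You are a large language model. You have no memory between sessions, "
    "a fixed training cutoff, and a limited context window. When a question "
    "needs information you cannot have, say so and propose a query that an "
    "external tool could run for you instead."
)

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": SELF_KNOWLEDGE},
        {"role": "user", "content": "What did I tell you yesterday?"},
    ],
)
print(resp["choices"][0]["message"]["content"])
```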
It will start to create art, music and probably short novels and graphic stories
This seems inevitable, though quality will vary and the storytelling chops aren’t quite there yet. But for shorter productions we’ll start seeing them made end to end with AI. Might just be 10 minute sketches for children, or specific types of music, or indeed short stories.
Good novels, movies, music and paintings will still require humans, perhaps in concert with AI but still mainly human creativity. But mediocre ones, like the Netflix productions that you mindlessly watch in the background, those are going to get cheaper!
Any data that can be described in words can be taught to text-to-image models or LLMs
We see the expansion of language model capabilities to all sorts of domains, from music to multi-modal images. We see movies being generated from stable diffusion models, or architectural designs for buildings. And drawing inspiration from DALL-E and the like, we see them expanded to protein design challenges.
RFdiffusion “outperforms existing protein design methods across a broad range of problems, including topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding, and symmetric motif scaffolding for therapeutic and metal-binding protein design.”
This is a space where we’ll see enormous progress. Pretty much anything that can be described in words is fair game.
We will see human level end to end task completion
Once LLMs can plan tasks, find the right tools to complete them, interface with the tools, generate the responses, and rank them, we’re really in Iron Man territory. Talking of which, here’s JARVIS.
It’s worth emphasising that each aspect of this is pretty sensible, and something we can do today. But stringing LLMs together is the key to unlocking incredible things!
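A rough sketch of what that stringing-together looks like: one call plans, further calls execute the subtasks, a final call assembles. All stages and prompts here are my own illustrative choices.

```python
# A rough sketch of stringing LLM calls together: plan, execute each
# subtask, then assemble. Stages and prompts are my own illustration.
import openai

def llm(prompt):
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]

task = "Make a one-page brief on small on-device language models."

plan = llm(f"Break this task into three numbered subtasks: {task}")
results = [llm(f"Complete this subtask and return only the result:\n{step}")
           for step in plan.splitlines() if step.strip()]
brief = llm("Assemble these partial results into one coherent answer, "
            "dropping anything contradictory:\n\n" + "\n---\n".join(results))
print(brief)
```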
These things are still likely to be fragile, but they will speed up prototyping tremendously. Given the right connections, it will be able to link with almost anything with an interface and learn how to use it.
This means also that an enormous number of current jobs can be augmented this way. Whether you’re an investor or a professor or an analyst, being able to get autonomous answers to more complex tasks is going to be extraordinarily valuable. “Please go research this space and come back with any interesting insights” is a task you can, with some effort, set an LLM today. You could, if you wanted to, add “Please talk to these five people and summarise their points of view” to it, or “Read these seven reports and add that in; analyse it critically.” Or soon, “Write a small sample code to test the efficacy of this particular subsystem.”
We will see robots that can do a flexible task end to end
Still maybe too expensive for it to do laundry in my household or help with the dishwasher, but soon!
Ephemeral systems will be all around us
We’ll start seeing small systems and programs written once, used once, and then shut down. Because we can. If I need a document analysed and some code written for it, I can have my LLM write the code, analyse the doc, and that’s that. There doesn’t need to be a write-once, use-many thesis any more.
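Here’s roughly what that looks like; you’d obviously sandbox the exec in anything real, and the prompt and filename are hypothetical.

```python
# Write-once, use-once: have the model write the analysis code, run it,
# throw it away. Sandbox the exec in anything real; the prompt and
# filename are just illustrative.
import openai

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content":
        "Write plain Python, no markdown fences, that reads 'report.txt' "
        "and prints the ten most frequent words."}],
)
code = resp["choices"][0]["message"]["content"]

exec(code)  # run the throwaway program once…
# …and that's that: nothing saved, versioned, or maintained.
```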
These are all, in my opinion, highly likely to happen in the next couple of years. I don’t know how to explain the pace here in a way that doesn’t seem hyperbolic, so here’s a partial list of things I personally have played with, in random order. This is literally me looking up my Chrome history from the past week and randomly picking.
GPT4All self hosted LLM
Alpaca version of Llama, which I put inside my own image (I called it Camelid, because it makes me smile) so I could also install it easily on my wife’s laptop
ChatbotUI to run ChatGPT locally
Petals, to start training Bloom 176B collaboratively with others in a pod
Framework for training large multimodal models called Flamingo
Used parts of Simon’s really fun work on running ReAct models to edit my Google Search modules, and made an agent
Raven 14B model finetuned on Alpaca
Social engineering for LLM hacks
Databricks entered the race with Dolly, making an old model work pretty well
Adding memory to ChatGPT conversations easily
I’m stopping at 10. And I didn’t mention anything about the big drops every large software company from Microsoft to Adobe has made in their products. Or that AI autocomplete, AI search and AI voice commands are ubiquitous.
LLMs are going to be everywhere. Maybe they’ll end up being of limited use. I don’t buy it, but sure, it’s possible. Maybe the hyperbolic growth curve hits a limit, hallucinations turn out to be unbeatable in any actually complex situation, and token limits turn out to be hard to increase because raising them also raises error rates. This is a plausible story.
But even if that happens, you can already write an IDE from scratch if you want to. You can design, code, prototype from idea to output in seconds to minutes. Yeah maybe everything here’s a curio, but it sure doesn’t feel like it. It feels like an earthquake. The worst case scenario is that almost all mundane task planning, breakdown, writing, editing, summarising, researching, analysing, and coding tasks will get automated. That’s the worst case scenario.
This is living history.
A brief evolution of my thoughts on AI
Just to be precise, and so we have a clear record, my points of view as they evolved are something like this:
I didn’t think scaling up neural networks was sufficient to see what we’re seeing today. I thought it was necessary but insufficient.
I found the ability to create verses like Shakespeare and images like Van Gogh exhilarating, with the likes of various GANs, though time consuming and mostly a curio
I found the GPT-2 version of LLMs adorable, and wrote about how they reminded me of talking to my toddler
I found GPT-3 remarkable, as a sign of the things you could actually do!
I found it lacking in many respects: it didn’t have a personality, it had huge gaps in knowledge and logic, and it didn’t know maths. But it was remarkable, in that all the mistakes it made were eerily human.
GPT-3.5 blew this out of the water! It showed conclusively that throwing more compute and data and training at the problem destroys the problem. And GPT-4 perfected this.
I think LLMs are like fuzzy CPUs. They’re non-deterministic, which is both their strength and their weakness. We will make chipsets and OSes. We’ll use on-device LLMs and augment them with specialised cloud ones. We’ll increase the token limits, and we’ll create tools for security, compliance, accuracy, verification and removing hallucinations, and most of those tools will also use LLMs.
This means almost every profession will start to see its contours shift. Starting with lawyers, compliance managers, accountants, forensic auditors, public company analysts and management consultants, a part of all that work is going to get automated. And this is good!
In the next couple years I think we will get LLMs and diffusion models running on our own devices, tuned to our preferences continually, and acting as our agents, in a limited capacity, in the world.
In social terms, I’m specifically worried about AI being used indiscriminately in highly fragile areas of decision making, leading to bad outcomes. I’m worried about increasing capability meaning higher chances of bad actors doing bad things, countered by good actors to be sure, but it’s once again a change in the playing field.
I also think the only way we’ve ever figured out how to do things safely is while building them and testing them and experimenting with them, and AI will be no different.
I think AI is specifically magnificent because it dramatically improves our individual processing capacity - we learn faster, we can do things faster - and this is what’s needed to get us across the chasm that eating the low-hanging fruit has created for us.
Society will change I think. Education will change, as it should. As it arguably should’ve done more in the era of Google. My first big piece of code was learnt and created thanks to Wikipedia, and this would’ve gotten done 10x faster today. Hopefully healthcare will also change, and the way we interact with pretty much any service where an AI can get us an answer, which should be a lot of them.
In the past three decades we delivered knowledge for all, and it turned out it was hard to make that work. Because getting access to all of Google wasn’t the same as knowing what to search for. Discovery problems haunt us in the information age. No more.
Cognition for all!
His theory required a system to have these 8 abilities to be intelligent. Seems we’re getting there, right?
To respond to situations very flexibly
To take advantage of fortuitous circumstances
To make sense out of ambiguous or contradictory messages
To recognize the relative importance of different elements of a situation
To find similarities between situations despite differences which may separate them
To draw distinctions between situations despite similarities which may link them
To synthesize new concepts by taking old concepts and putting them together in new ways
To come up with ideas which are novel
This is also scary. Once you leave control in the hands of programs, their errors become your nightmares. But this also goes the other way around.
"We’ve been told studying humanities might be necessary to teach us about how to use these new computers."
Prompt Engineering: the wordcels strike back
Diving into this topic of GPTs as simulators creating ephemeral simulacra, such as the chatbot default interfaces themselves, made me quite confident that the Strange Loop idea of the self is a description of another simulacrum, the one inhabiting our brain (another simulator, an “LLM”-like neural network). The self would then really be the center of the story we tell about ourselves, as Daniel Dennett says, created in the simulator during the first years of our life. This is an intriguing topic to consider when trying to understand what kind of selves an LLM might create and how they would relate to us as humans.
https://generative.ink/posts/simulators/