The Claude comment regarding LLM potential (footnote 8) reminds me of the children's "Stone soup" story... In current parlance an LLM is a language model based on the transformer architecture. If you change the architecture, training paradigms, add interactive learning, causal reasoning, etc, then at some point this is no longer a transformer or a language model, no longer stone soup - it's a new architecture, and a new kind of model, moving towards a more animal-like cognitive architecture, perhaps.
Can you ride a bicycle to the moon? Yes, if you remove the wheels and add rocket engines, etc!
It's kinda fun to think back. When I was at school I had a near perfect memory - so I also figured I didn't need to study. And that worked until sometime halfway through my degree course!
LLMs are pretty bad at any sort of compression so far; your example of cellular automata is way beyond what they can do. James Bowrey uses just the 5-bit binary numbers from 0 to 16 run together and asks for a short program that will reproduce the bit-string. Most any solution they give will just quote the string; it won't use a simple counter to compress the bit string. He's a monomaniac on data compression, one of the Hutter Prize judges, but a bit stuck on lossless compression and doesn't distinguish meaning / information from noise / mere data, when there are many cases where that distinction is central, e.g. markets, where a single bit of information is equivalent to doubling your money, as Kelly showed in the '50s (the same paper better known for the Kelly Criterion). Think of how much market data goes through an HFT fund over its doubling time. Effectively they manage only a single bit of predictive compression over many GB of market data.
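To make that test concrete, here's a minimal sketch (my reconstruction of the setup as described, not Bowrey's actual harness): the 85-bit string of the numbers 0 to 16 in 5-bit binary, the quote-the-literal answer a model typically gives, and the counter answer that actually compresses.

```python
# Reconstruction (assumed details) of the compression test described above:
# concatenate the 5-bit binary forms of 0..16 into one bit-string.
bitstring = "".join(format(n, "05b") for n in range(17))
print(len(bitstring))  # 17 numbers x 5 bits = 85 bits

# The answer models tend to give: a program that just quotes the data.
quoting_program = f'print("{bitstring}")'

# The compressed answer: a counter loop that regenerates the data.
counter_program = 'print("".join(format(n, "05b") for n in range(17)))'

# Both reproduce the string, but only the counter is shorter than the data.
print(len(quoting_program), len(counter_program))
```

The point being that a correct answer exists well under the length of the raw string, and the model usually doesn't find it.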
On manifolds of solutions:
You might want to check out the Bzogramming substack, e.g. https://bzolang.blog/p/the-most-profound-problem-in-mathematics on SAT solvers that can solve many practical NP-hard problems. Many other illuminating posts there.
For a dictionary of intimidating geometric ML math that is semi-comprehensible:
https://patricknicolas.substack.com/p/demystifying-the-math-of-geometric
Great article. I wonder if the No Free Lunch theorem is relevant here. https://en.wikipedia.org/wiki/No_free_lunch_theorem
Couple of small typos:
> But what they get wrong are where the interesting failures.
> behaviourally speaking its outrage does not seem different (should be "output"?)
Thanks! Will fix.
Appreciate your nuanced view of whether they understand or reason. Too often understanding is seen as black and white, so that they “don't truly understand” because they “just predict the next token”. In reality they clearly have some amount of understanding, but the quality of their understanding is not as high as a human's. The epicycles represent understanding, but Newton represents higher quality understanding. It also seems clear that the quality of LLM understanding is increasing.
Thank you! That was the intention.
Thanks, good points. And I like Cowen's definition of understanding. A corollary I find truer by the year is that intelligent people are able to explain/predict complicated things in simple terms, while dumb people explain/predict simple things in complicated terms.
By that definition, LLMs are fabulously dumb.
If scaling current methods is the final say in AI, biology is far from being obsolete.
Discovering F=Gm1m2/r^2, the underlying generator of planetary movement, took thousands of years, and for most of that time people were pretty convinced by epicycles, which are a good approximation, similar to what they claim LLMs produce. Claiming that “LLMs cannot reason” because they replicate failures of human reasoning is not a strong claim. In fact, the fact that LLMs tend to come up with epicycles, much as Ptolemy actually did and humans believed for thousands of years, is a really good argument that they're reasoning pretty similarly to how we do.
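The epicycle point can be made concrete: epicycles are mathematically a truncated Fourier series, so stacking enough circles approximates any closed orbit without ever touching the underlying generator. A sketch, using an illustrative orbit rather than real planetary data:

```python
import numpy as np

# An illustrative closed orbit in the complex plane (x + iy): an
# off-centre ellipse standing in for a planet's apparent path.
N = 256
t = np.linspace(0, 2 * np.pi, N, endpoint=False)
orbit = np.cos(t) + 0.6j * np.sin(t) + 0.2

# Epicycles are exactly a truncated Fourier series: each coefficient
# is one circle (radius |c_k|, angular speed given by its frequency).
coeffs = np.fft.fft(orbit) / N
freqs = np.fft.fftfreq(N, d=1 / N)  # integer angular speeds

def epicycle_fit(n_circles):
    """Rebuild the orbit from the n largest circles."""
    top = np.argsort(-np.abs(coeffs))[:n_circles]
    return sum(coeffs[k] * np.exp(1j * freqs[k] * t) for k in top)

# A handful of circles already fits this orbit to machine precision,
# with zero knowledge of any inverse-square law behind it.
err = np.max(np.abs(orbit - epicycle_fit(3)))
print(err)
```

Which is exactly why epicycles predicted well for so long while explaining nothing: good fit, wrong generator.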
I explicitly state in the essay that LLMs do reason.
Mea culpa, thanks for the clarification, on rereading I see you mentioned that, but I’m still pretty confused as to what your central claim is:
1) LLMs will never be able to come up with new things
2) LLMs are incapable of understanding the underlying principles behind things
3) something else?
I'm making no such claims, unusual for an essay I know :) I'm saying:
- They primarily learn patterns.
- Learning such patterns gets them to be remarkably useful, more so than anyone would've thought before.
- Learning such patterns still causes many "silly mistakes", because they don't learn the underlying generators.
- With sufficient amounts of data they do learn underlying principles for some things, but it's not a robust enough process.
- Reasoning helps here, because they learn to reason like us, but this has the same problem: the reasoning patterns they learn do not share our underlying generator.
- As we push more data/info/patterns into the models they will get smarter about what we want them to do, even though the type of intelligence is closer to a market's intelligence than an individual being's (speculative).
Maybe I should add a summary at the bottom.
Interesting, thanks for the clarification. My first question would be whether generators even exist for many phenomena - even for your provided example of gravity, we are aware that the model is still incomplete and we don’t entirely understand the full complexity of the system. The underlying assumption here is that there exists some sort of platonic ideal structure that produces phenomena which can be modeled and generalized, which is meaningfully different from pattern following, which seems suspect.
It's not true of everything, I think, but true of enough things that something akin to this is definitely happening across all the types of questions we ask it and the answers we receive.