Analysis like this feels like the beginnings of a field we might call “Experimental English” the same way split brain studies and behavioral econ unlocked experimental philosophy.
♥️
Related - have you read this:
https://blog.fsck.com/2025/10/13/this-one-weird-trick-makes-the-ai-a-better-writer/
Dude just put the high school writing textbook in the prompt, and it seemed to work.
THANK YOU for separating cadence from frequency. These days, so many people use the former term to mean the latter in an attempt to sound sophisticated. Let's not let this distinction erode like so many others.
I see this as more (delightful) evidence that the "intelligence" we ascribe to LLMs has almost everything to do with language and almost nothing to do with machines. The ultimate scaling law might be less about compute and more about language itself: how much meaning can we understand and extract from the structure latent in language at corpus scale?
I think when we celebrate great writing, we are in part recognizing a writer's agency and choices - the "why" of how they chose to write in a certain manner.
AI does a fantastic job at writing in the style of DFW - I've had a lot of fun with this. But what does it mean for it to "write well"? On its own, it can definitely demonstrate many of the skills; and it certainly is a competent writer, esp. in the context of reporting facts and analysis in the style of a well-regarded newspaper or magazine.
So for AI to write well, you likely need to start with at least a simulacrum of agency.
This is absolutely fascinating. Is any of the Horace project accessible?
I remain constantly frustrated by the writing from the normal LLMs.
Given what you know, how would you suggest someone who cares about style go about doing the best writing?
I should prob clean and put it up as a repo. Will try!
That would be amazing. Let me know.
Beautiful. Please do post more poetry about decent topics. This is new ground!
Thank you!
Interesting although I wonder if the meta issue here is subjective vs objective? Anyone can learn to write press releases with or without AI. What 'good' writing is depends on the eye & ear of the beholder. And because this is subjective, I'm not sure it can be easily replicated.
AI speeds up the ability to imitate - with its help I could write a poem in 50 different styles, something I couldn't do without it. And some of these may be better than others.
I agree broadly that AI can help generate diversity, and that writing can be subjective, but it's also the case that AI is substantially worse at writing good prose or poetry than at writing press releases.
Or we could just ask a lot of us redundant English/Writers types to evaluate APE (AI generated English) samples, and use recursion to get the little buggers to write like Orwell.
Nah, let them moulder peacefully.
A wonderful piece by Ogden Nash about chimpanzees replacing the Varsity in a Bowl Game appeared in Sports Illustrated a long time ago. Have an AI Minion check it out.
Alas, https://www.nature.com/articles/s41598-023-45644-9?utm_source=chatgpt.com
And that’s from three years ago too …
It seems like a challenge with a surprise / cooldown measurement like this is that interesting, surprising, and evocative writing is often developed over multiple sentences / paragraphs. Depending on how your measurement is structured, I could imagine that getting measured as a "surprise / cooldown" when a reader would experience it more as one sustained interesting departure from expectations. For instance, think of a surprising metaphor that is introduced then elaborated on over a few sentences—like "love is like a praying mantis," with some sentences elaborating on that idea after that. Would the sentences after "praying mantis" score lower on surprise, because now that those words have been brought in, subsequent words relating to praying mantises are less surprising?
I think this might matter to broader questions of good writing because my intuition is that good writing often has "good ideas," which are likely to be novel but are also likely to be developed over longer lengths. So a measurement that is focused more on shorter lengths is likely going to have some false negatives. (Obviously no metric is perfect, just thought I might raise this as it occurred to me while reading.)
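The worry above can be made concrete with a toy sketch (my own illustration, not the post's actual metric, and the function name, smoothing, and `vocab_size` are all assumptions): any model that conditions on what it has already seen assigns more probability to a rare word's second occurrence, so the elaboration of a metaphor scores as less "surprising" than its introduction.

```python
import math
from collections import Counter

def running_surprisal(tokens, vocab_size=1000):
    """Surprisal of each token under an adaptive unigram model
    built from the tokens seen so far (add-one smoothing).
    A toy stand-in for an LM's per-token logprobs."""
    counts = Counter()
    total = 0
    out = []
    for tok in tokens:
        p = (counts[tok] + 1) / (total + vocab_size)  # smoothed estimate
        out.append(-math.log2(p))                     # surprisal in bits
        counts[tok] += 1
        total += 1
    return out

text = "love is like a praying mantis its mantis arms fold like a prayer".split()
s = running_surprisal(text)
first = s[5]   # first "mantis"
second = s[7]  # second "mantis": cheaper purely because it was already seen
```

Here `second < first` even though, to a reader, the second mention is part of one sustained surprising image rather than a cooldown.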
Yes, the same principles need to be applied across all scales.
"I remember getting o1-pro to write an entire novel for me 6 months ago. It was terrible." 😁
It was! Alas.
This is the conversation we need to be having but the answer is it depends on your definition of "better." If you read texts in their context, informed by their context, as I do, there is no "in the style of" a writer, decontextualized. AI will not know what Dickinson was reading or responding to, ever. My preference of "better" in AI would simply be avoidance of the "not x but y" trick -- can you get it to do that??? Please???
This also speaks to the style-content problem. Style is often talked of as if it is an independent variable, esp. among AI users. But “good” style is the use of the right language patterns, rhythms, etc. to convey the moral and intellectual content of the work. The idea that AI can sub different styles in and out without that fundamentally changing what is being said is a widespread assumption I see online. Woolf was only half right in that quote you’re using, and she was fully aware of the need for style to be morally continuous with content. Context, ofc, is crucial to being able to see this at work a lot of the time. One need only imagine what a Brontë novel would be like in the “style” of Austen to see this point.
Don't disagree. The argument in favour is that contextualization is in fact what LLMs do so well with their interpolative efforts. But what works when we're playing glass bead games becomes much less so when done as actual writing. Which is quite fascinating.
"We tend to rank slop higher than non-slop, when tested, far too often to be comfortable"
Is there a specific study you're referencing?
old but https://www.nature.com/articles/s41598-024-76900-1
plenty others in similar vein
You lost me after "But, like, why is this necessary?" because your answer seems to be nothing more than that it would be "neat." Well, okay, maybe. But that doesn't make it necessary. So, why is it necessary for machines to write well--or at all? We have people who can do that.
Actually I do think "because it's neat" is a sufficient justification for trying to do something new. In this specific instance language is one of our core interfaces with the world, and as machines increasingly mediate it, the quality of that language matters.
Or we could just disintermediate the machines.
You can do that right now! That is my aspiration, that should be the aspiration for every writer. But the way to do that is to be better, not to force the machines to not get better.
Hm. I'm confused. I don't understand your comment, and I would like to understand it. Are you saying that using the machine helps human writers become better at writing, and that making the machine better at writing would help the humans become better writers?
I’m saying that the machines getting better is good, because them getting better at language is an indication they will get better at thinking, and engaging with us. It will also push humans to get better, one hopes, which is good too, though whether we will succeed or fail is of course up to us.
People don't need machines to improve themselves, so this still doesn't answer the original question of why machines that write better are necessary. I'd say they aren't. But a lot of people seem to be determined to build them.