Or we could just ask a lot of us redundant English/Writers types to evaluate APE (AI generated English) samples, and use recursion to get the little buggers to write like Orwell.
Nah, let them moulder peacefully.
A wonderful piece by Ogden Nash about chimpanzees replacing the varsity in a bowl game appeared in Sports Illustrated a long time ago. Have an AI Minion check it out.
Alas, https://www.nature.com/articles/s41598-023-45644-9?utm_source=chatgpt.com
And that’s from three years ago too …
Analysis like this feels like the beginnings of a field we might call “Experimental English,” in the same way split-brain studies and behavioral econ unlocked experimental philosophy.
♥️
It seems like a challenge with a surprise / cooldown measurement like this is that interesting, surprising, and evocative writing is often developed over multiple sentences or paragraphs. Depending on how your measurement is structured, I could imagine that getting measured as a "surprise / cooldown" when a reader would experience it more as one sustained, interesting departure from expectations. For instance, think of a surprising metaphor that is introduced and then elaborated on over a few sentences, like "love is like a praying mantis," with some sentences developing that idea afterward. Would the sentences after "praying mantis" score lower on surprise, because now that those words have been brought in, subsequent words relating to praying mantises are less surprising?
I think this might matter to broader questions of good writing because my intuition is that good writing often has "good ideas," which are likely to be novel but are also likely to be developed over longer lengths. So a measurement that is focused more on shorter lengths is likely going to have some false negatives. (Obviously no metric is perfect, just thought I might raise this as it occurred to me while reading.)
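The concern above can be made concrete with a toy sketch. Everything below is illustrative, not the article's actual method: it assumes surprisal is computed per token against the preceding context, here modeled as a simple unigram model blended with a cache of words already seen in the text. Once a rare word like "mantis" appears, the cache makes its later occurrences score as far less surprising, even though a reader may experience the whole passage as one sustained surprise.

```python
import math
from collections import Counter

def cache_surprisal(tokens, base_freq, cache_weight=0.3):
    """Toy per-token surprisal (in bits): mix a static, add-one-smoothed
    unigram model with a cache of words already seen earlier in this text.
    A rare word's first occurrence is highly surprising; its repeats are not."""
    vocab = len(base_freq)
    total = sum(base_freq.values())
    seen = Counter()
    out = []
    for i, tok in enumerate(tokens):
        p_base = (base_freq.get(tok, 0) + 1) / (total + vocab + 1)
        p_cache = seen[tok] / i if i else 0.0
        p = (1 - cache_weight) * p_base + cache_weight * p_cache
        out.append(-math.log2(p))
        seen[tok] += 1
    return out

# Tiny "corpus" standing in for the model's prior expectations.
base = Counter("love is like a rose and love is sweet".split())
text = "love is like a praying mantis the mantis devours its mate".split()
s = cache_surprisal(text, base)

first = s[text.index("mantis")]                          # first "mantis"
second = s[len(text) - 1 - text[::-1].index("mantis")]   # second "mantis"
# The second "mantis" scores lower surprisal than the first: the metaphor's
# elaboration looks like "cooldown" to the metric even mid-departure.
```

Real LLM surprisal (log-probabilities from a trained model) behaves the same way in this respect, which is exactly the false-negative worry: sustained development of a novel idea is conditioned on its own introduction.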
Yes the same principles need to be applied across all scales
"I remember getting o1-pro to write an entire novel for me 6 months ago. It was terrible." 😁
It was! Alas.
I see this as more (delightful) evidence that the "intelligence" we ascribe to LLMs has almost everything to do with language and almost nothing to do with machines. The ultimate scaling law might be less about compute and more about language itself: how much meaning can we understand and extract from the structure latent in language at corpus scale?
I think when we celebrate great writing, we are in part recognizing a writer's agency and choices - the "why" of how they chose to write in a certain manner.
AI does a fantastic job at writing in the style of DFW - I've had a lot of fun with this. But what does it mean for it to "write well"? On its own, it can definitely demonstrate many of the skills; and it certainly is a competent writer, esp. in the context of reporting facts and analysis in the style of a well-regarded newspaper or magazine.
So for AI to write well, you likely need to start with at least a simulacrum of agency.
This is the conversation we need to be having but the answer is it depends on your definition of "better." If you read texts in their context, informed by their context, as I do, there is no "in the style of" a writer, decontextualized. AI will not know what Dickinson was reading or responding to, ever. My preference of "better" in AI would simply be avoidance of the "not x but y" trick -- can you get it to do that??? Please???
This also speaks to the style-content problem. Style is often talked of as if it is an independent variable, esp. among AI users. But “good” style is the use of the right language patterns, rhythms, etc. to convey the moral and intellectual content of the work. The idea that AI can sub different styles in and out without that fundamentally changing what is being said is a widespread assumption I see online. Woolf was only half right in that quote you’re using, and she was fully aware of the need for style to be morally continuous with content. Context ofc is crucial to being able to see this at work a lot of the time. One need only imagine what a Bronte novel would be like in the “style” of Austen to see this point.
Don't disagree. The argument in favour is that contextualization is in fact what LLMs do so well with their interpolative efforts. But what works so well when we're playing glass bead games amounts to much less when done as actual writing. Which is quite fascinating.
We are still fighting the distribution equation.
1) We were sold that digitizing would guarantee knowing that the ad was delivered. Now we can't even get performance affidavits.
2) The cost of advertising is spread out and now is hard to quantify.
3) Most people don't even know what COGS is or how to calculate it... so determining cost of goods sold on both ends, for the advertiser and for the ones purchasing, and trying to get a write-off for speculators, is impossible.
4) The future selves do not want to pay the speculators' deficit. We all instinctively know this to be true.