Discussion about this post

Mike Randolph

“LLMs ability to judge their own work remains remarkably bad” — that’s the load-bearing sentence in the whole post. Three years working the same problem from a different angle and the same finding keeps surfacing.

I’m starting to think this isn’t just an LLM problem. Any system judging from inside its own work has a built-in blind spot; LLMs just make the failure unusually easy to see.

Victualis

I think for a PhD you now need to do a lot more: stuff that used to take years is now a few prompts away. In 2021 this was a PhD; now it's one small part of what's expected. But being able to investigate any domain in non-trivial detail from available data, outside the academy, is exciting. Thanks for sharing your inspiring experience!
