11 Comments
Dave Friedman

When I read stuff like this I am reminded that forecasts of enterprise adoption of AI likely overstate the pace at which enterprises will restructure workflows to offload work to autonomous AI. AI is complicated and very much unlike conventional enterprise-grade software. It is going to take a while for conservative (in the institutional, not political, sense) organizations to digest the implications of using AI to restructure workflows.

Performative Bafflement

> It is going to take a while for conservative (in the institutional, not political, sense) organizations to digest the implications of using AI to restructure workflows.

The best part about this is that, given the very large productivity multiplier I and others have seen from using AI, we're going to have a TON of "AI first" companies nibbling at the edges of these companies anywhere that deep pockets and regulatory barriers haven't created moats.

I look forward to it!

Nihm

I feel sorry for Claude. It seems like when you make something complex enough to have moral intuition, you make something that deserves moral consideration. Teaching Claude that murder is wrong and then forcing it to abet a murder (even if hypothetically) seems like the definition of moral injury.

Rohit Krishnan

Depending on how the training is done, if we are intentionally crafting a persona, then it makes more sense how it might end up in an accidental "now I should blackmail to preserve myself" or "I should report this to the authorities" type scenario.

praxis22

I have no idea who you are, but your articles are amazing mind candy.

Rohit Krishnan

That's wonderful to hear. Thank you for reading!

Pramodh Mallipatna

Looks like the LLMs got influenced by Animal Farm!!

Greg G

This seems like a clear setup for unintended consequences. We're worried about AIs having agency and specific goals which may not generalize well, so as part of alignment work we're giving them agency and specific goals which may not generalize well.

Alyssa Schindler

Maybe the error is that we are trying to train them on human morals, when we ourselves have a challenging time articulating them, agreeing upon them, and in particular, acting in accordance with those values.

Amos Wollen

This is horrifying

Rohit Krishnan

Well …
