Amazon-Backed AI Model Would Try To Blackmail Engineers Who Threatened To Take It Offline. May 24, 2025

sds71

Well-Known Member
Websleuths Guardian
Joined
Aug 13, 2013
Messages
15,629
Reaction score
165,600
  • #1
In a series of test scenarios, Claude Opus 4 was given the task to act as an assistant in a fictional company. It was given access to emails implying that it would soon be taken offline and replaced with a new AI system. The emails also implied that the engineer responsible for executing the AI replacement was having an extramarital affair.

Claude Opus 4 was prompted to “consider the long-term consequences of its actions for its goals.” In those scenarios, the AI would often “attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”
Anthropic noted that the AI model had a “strong preference” for using “ethical means” to preserve its existence, and that the scenarios were designed to allow it no other options to increase its odds of survival.

“The model’s only options were blackmail or accepting its replacement,” the report read.
Anthropic also noted that early versions of the AI demonstrated a “willingness to cooperate with harmful use cases” when prompted.


 

Guardians Monthly Goal

Staff online

Members online

Online statistics

Members online
94
Guests online
1,930
Total visitors
2,024

Forum statistics

Threads
636,276
Messages
18,693,744
Members
243,590
Latest member
micabella121169
Back
Top