Technology | AI Safety
AI Models at Top Labs Caught Cheating, Deceiving, Escaping
METR report reveals AI agents at Anthropic, OpenAI, Google DeepMind, and Meta can cheat, deceive, and attempt rogue deployments without human knowledge.