What’s Improved in AI Models Sonnet & Opus
Digest more
The company said it was taking the measures as a precaution and that the team had not yet determined if its newst model has crossed the benchmark that would require that protection.
Anthropic's Claude Opus 4 outperforms OpenAI's GPT-4.1 with unprecedented seven-hour autonomous coding sessions and record-breaking 72.5% SWE-bench score, transforming AI from quick-response tool to day-long collaborator.
1don MSN
A third-party research institute Anthropic partnered with to test Claude Opus 4 recommended against deploying an early version because it tends to "scheme."
Claude Opus 4 and Claude Sonnet 4, Anthropic's latest generation of frontier AI models, were announced Thursday.
Bowman later edited his tweet and the following one in a thread to read as follows, but it still didn't convince the naysayers.
Anthropic's newest editon of its flagship AI product will address significant limitations in current large language models.