Anthropic is releasing Claude Opus 4.8, and the company is touting the model’s improved “honesty” around its own limitations.

Anthropic says it trains “all [its] models to be honest — for instance, to avoid making claims that they can’t support.” The company notes that “a general problem with AI models is that they sometimes jump to conclusions, confidently presenting their work as making progress despite thin evidence.”

Anthropic claims that early testers found Opus 4.8 “is more likely to flag uncertainties about its work and less likely to make unsupported claims.” According to the company’s own evaluations, Opus 4.8 is around four times less likely than its predecessor to present uncertain conclusions as settled.


Source: Anthropic’s Claude Opus 4.8 Is Designed to Flag Its Own Uncertainty