Anthropic Releases Claude Opus 4.7: A Major Upgrade Amid Mythos Preview Buzz
What Just Happened: Anthropic Launches Opus 4.7
On April 16, 2026, Anthropic announced the general availability of Claude Opus 4.7, a significant upgrade to its flagship AI model — and the timing could hardly be more dramatic. The release comes just nine days after the company unveiled Claude Mythos Preview, a new tier of AI model that Anthropic describes as a “step change” in capabilities beyond even the Opus line. Opus 4.7 occupies a strategically important position: it is the most powerful Claude model available to the general public, while deliberately sitting below Mythos Preview in overall capability, particularly in cybersecurity-sensitive domains.
The announcement was accompanied by an unusually large set of testimonials from early-access testers across finance, legal, software engineering, and security, including executives from Stripe, Cursor, Ramp, Vercel, and Databricks. Their consensus: Opus 4.7 is not an incremental update but a meaningful leap forward for real-world, autonomous coding workflows.

Opus 4.7 vs. Opus 4.6: The Numbers Tell the Story
The most important question is always: how much better is the new model? Anthropic’s benchmark data, backed by independent evaluations from partner companies, paints a clear picture of improvement across almost every dimension.
On CursorBench, a widely cited coding benchmark, Opus 4.7 cleared 70% compared to Opus 4.6 at 58% — a 12 percentage point jump that Michael Truell, Cursor’s co-founder and CEO, called “a meaningful jump in capabilities.” On Anthropic’s own internal 93-task coding benchmark, the model improved resolution by 13% over its predecessor, including solving four tasks that neither Opus 4.6 nor Sonnet 4.6 could complete.
- CursorBench: 70% (Opus 4.7) vs. 58% (Opus 4.6) — a 12-point improvement
- Internal coding benchmark (93 tasks): +13% resolution over Opus 4.6
- Rakuten-SWE-Bench: Opus 4.7 resolves 3x more production tasks than Opus 4.6
- XBOW visual-acuity benchmark: 98.5% vs. 54.5% for Opus 4.6, a 44-point gain (about 81% in relative terms)
- General Finance module: 0.813 vs. 0.767 for Opus 4.6
- Databricks OfficeQA Pro: 21% fewer errors than Opus 4.6
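The figures above mix two kinds of comparison: gains in percentage points and gains relative to the old score. As a rough check on the arithmetic, here is a minimal Python sketch that computes both for the CursorBench and XBOW numbers quoted in this article. The helper name `gains` is purely illustrative, and the 70% CursorBench score is treated as an exact value even though the article only says the model "cleared" it.

```python
def gains(old: float, new: float) -> tuple[float, float]:
    """Return (gain in percentage points, gain relative to the old score in %)."""
    absolute = new - old
    relative = (new - old) / old * 100
    return absolute, relative


# Scores as quoted in the article (percent).
cursor_abs, cursor_rel = gains(58.0, 70.0)
xbow_abs, xbow_rel = gains(54.5, 98.5)

print(f"CursorBench: +{cursor_abs:.1f} points, +{cursor_rel:.1f}% relative")
# prints "CursorBench: +12.0 points, +20.7% relative"
print(f"XBOW: +{xbow_abs:.1f} points, +{xbow_rel:.1f}% relative")
# prints "XBOW: +44.0 points, +80.7% relative"
```

Read this way, the 12-point CursorBench gain and the 81% XBOW gain are not directly comparable: the first is an absolute difference, the second a relative one.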
Perhaps the most striking improvement came in visual understanding. Opus 4.7 accepts images up to 2,576 pixels on the long edge — approximately 3.75 megapixels — which is more than three times the resolution of prior Claude models. XBOW’s CEO Oege de Moor noted that their single biggest pain point “effectively disappeared” with the new visual acuity, unlocking autonomous penetration testing workflows that were previously impossible.
“It’s a more intelligent, more efficient Opus 4.6: low-effort Opus 4.7 is roughly equivalent to medium-effort Opus 4.6.”
That single observation from Caitlin Colgrove, co-founder and CTO at Hex, captures the essence of the upgrade: you get more capability at lower effort levels, which translates directly to lower costs and faster response times for real-world usage.
The Mythos Connection: Why Opus 4.7 Is Deliberately Less Capable
Here’s where the story gets interesting. Anthropic did not release its most powerful model. Claude Mythos Preview, announced on April 7, 2026, sits in an entirely new tier above Opus. But Mythos Preview is restricted to a small group of partner organizations focused on cybersecurity work, and even Opus 4.7 has been trained with what Anthropic describes as “efforts to differentially reduce” its cyber capabilities compared to Mythos Preview.
During Opus 4.7’s training, Anthropic experimented with selectively dampening certain cybersecurity skills — a novel approach to model safety that the company is treating as a live experiment. Opus 4.7 is the first model to ship with safeguards that automatically detect and block requests indicating prohibited or high-risk cybersecurity uses. What Anthropic learns from the real-world deployment of these safeguards will directly inform the eventual, broader release of Mythos-class models.
This creates a fascinating dynamic: Opus 4.7 is simultaneously Anthropic’s most powerful public model and a testbed for the safety infrastructure that will eventually govern even more capable systems. The company is essentially stress-testing its guardrails on a “less risky” platform before deploying them at full scale.
This cautious approach comes in the wake of what Fortune described as an “accidental data leak” in late March 2026 that revealed Mythos’ existence before Anthropic was ready to announce it. After the leak, the company chose to proactively share details about Mythos while emphasizing its restricted access and cybersecurity-specific purpose.
New Features That Change How You Work With Claude
Opus 4.7 introduces several features that go beyond raw benchmark improvements and reshape the user experience:
1. The New “xhigh” Effort Level
Opus 4.7 introduces a new “xhigh” (extra high) effort level between the existing “high” and “max” settings. This gives users finer control over the tradeoff between reasoning depth and response latency. For coding and agentic use cases, Anthropic recommends starting with “high” or “xhigh” effort — suggesting that the sweet spot for productivity has shifted upward compared to Opus 4.6.
2. Task Budgets (Public Beta)
On the Claude Platform API, developers can now use task budgets to guide Claude’s token spending across longer runs. This means you can allocate a fixed token budget and trust the model to prioritize work accordingly — a critical feature for autonomous agents that need to manage their own resource consumption during extended workflows.
3. Claude Code Auto Mode
Claude Code now offers an “auto mode” — a permissions option where Claude makes decisions on your behalf during longer tasks, reducing interruptions while maintaining safety. Combined with the new /ultrareview slash command that produces dedicated review sessions for catching bugs and design issues, this makes Claude Code significantly more practical for extended development sessions.
4. Stricter Instruction Following
One change that caught early testers off guard: Opus 4.7 follows instructions more literally than previous models. Where earlier versions might interpret prompts loosely or skip parts entirely, Opus 4.7 takes them at face value. This is a net positive for reliability but means that some existing prompts may need adjustment. As Anthropic noted, “prompts written for earlier models can sometimes now produce unexpected results.”
5. An Updated Tokenizer
Opus 4.7 uses a new tokenizer that processes text differently from Opus 4.6. The tradeoff is that the same input can map to roughly 1.0 to 1.35 times as many tokens, depending on content type. However, Anthropic’s internal testing shows that net token usage across all effort levels has improved on coding evaluations: the model’s increased efficiency more than compensates for the tokenizer’s higher token count.
What Industry Leaders Are Saying
The breadth of positive feedback from Opus 4.7’s early-access program is notable. Here are some of the most telling assessments:
Stripe reported a “double-digit jump in accuracy of tool calls and planning” in their core orchestrator agents, with the model being “the first to pass our implicit-need tests” and continuing execution through tool failures that previously stopped Opus cold.
Rakuten found that Opus 4.7 resolves three times more production software engineering tasks than Opus 4.6 on Rakuten-SWE-Bench, with “double-digit gains in Code Quality and Test Quality.”
CodeRabbit observed a 10% improvement in recall for bug detection in complex pull requests, with the model being “a bit faster than GPT-5.4 xhigh” on their evaluation harness.
Harvey, the AI platform for legal work, reported that Opus 4.7 scored 90.9% on the BigLaw Bench at high effort, with “better reasoning calibration on review tables and noticeably smarter handling of ambiguous document editing tasks.”
Genspark highlighted three production differentiators: loop resistance, consistency, and graceful error recovery — noting that “a model that loops indefinitely on 1 in 18 queries wastes compute and blocks users.”
Pricing and Availability: No Change, for Now
Perhaps surprisingly given the capability improvements, Opus 4.7 maintains the same pricing as Opus 4.6: $5 per million input tokens and $25 per million output tokens. The model is available across all Claude consumer products, the Claude Platform API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry.
This pricing stability is significant. Capability upgrades are often an occasion for price increases, and holding per-token rates flat suggests Anthropic is absorbing any additional compute cost itself. That strengthens its competitive position, particularly with high-volume API users and enterprise customers running autonomous agents.
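Flat per-token prices do not necessarily mean a flat bill, because the new tokenizer can count the same input as more tokens. The sketch below is a back-of-the-envelope estimate in Python using only figures quoted in this article ($5 and $25 per million tokens, and the 1.0x to 1.35x tokenizer range). The function names are illustrative, and the estimate covers the input side only: it ignores the output-side efficiency gains Anthropic reports, so treat it as an upper-bound sanity check rather than a billing model.

```python
INPUT_USD_PER_MTOK = 5.00       # input price quoted in the article
OUTPUT_USD_PER_MTOK = 25.00     # output price quoted in the article
TOKENIZER_RANGE = (1.0, 1.35)   # article: same input maps to 1.0x-1.35x as many tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a given number of input and output tokens."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def input_cost_range(old_input_tokens: int) -> tuple[float, float]:
    """Bounds on input cost when text counted under the old tokenizer is re-tokenized."""
    low, high = TOKENIZER_RANGE
    return (
        request_cost(round(old_input_tokens * low), 0),
        request_cost(round(old_input_tokens * high), 0),
    )


low_usd, high_usd = input_cost_range(1_000_000)
print(f"Input cost for text that was 1M tokens before: ${low_usd:.2f} to ${high_usd:.2f}")
# prints "Input cost for text that was 1M tokens before: $5.00 to $6.75"
```

Whether a real workload lands near the low or high end depends on content type, so measuring actual token counts on a sample of your own traffic is more reliable than this estimate.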
The Bigger Picture: Anthropic’s Two-Tier Strategy
With Opus 4.7 and Mythos Preview, Anthropic is now operating a two-tier model strategy that mirrors a pattern seen in other industries: a powerful, broadly available product alongside a more capable, restricted version for specialized use cases. This approach allows the company to:
- Ship improvements faster: Opus 4.7 doesn’t need to wait for Mythos-class safety validation to reach the public.
- Test safety infrastructure in production: The cyber safeguards deployed on Opus 4.7 generate real-world data that will inform Mythos’ eventual broader release.
- Maintain competitive pressure: Opus 4.7’s benchmark scores — including state-of-the-art results on GDPval-AA and the Finance Agent evaluation — keep Anthropic competitive with OpenAI’s GPT-5.4 and Google’s Gemini 3.1 Pro in the public market.
- Build partner relationships: The restricted Mythos Preview creates exclusivity for cybersecurity partners, strengthening Anthropic’s position in a high-stakes, high-value market segment.
What This Means for Developers and Enterprises
For teams already using Opus 4.6, the migration to Opus 4.7 is straightforward but requires some planning. Anthropic recommends:
- Start with the “high” or “xhigh” effort level for coding tasks and measure the difference in quality versus token cost.
- Review existing prompts — stricter instruction following means some prompts that relied on the model’s flexibility may need tightening.
- Consider downsampling images before sending them to the model if you don’t need the extra visual resolution, to reduce token consumption.
- Use the new task budgets feature to manage token spend on long-running agentic workflows.
- Monitor the migration guide Anthropic has published for additional recommendations specific to your use case.
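For the image-downsampling recommendation above, the first step is deciding what size to resize to. Here is a minimal Python sketch that computes target dimensions with the long edge capped while preserving aspect ratio. The 2,576-pixel default comes from the limit quoted earlier in this article; the function name and the choice of cap are assumptions for illustration, and the actual resizing is left to whatever image library you already use.

```python
MAX_LONG_EDGE = 2576  # long-edge limit quoted in the article


def fit_long_edge(width: int, height: int, max_long_edge: int = MAX_LONG_EDGE) -> tuple[int, int]:
    """Return (width, height) scaled so the longer side is at most max_long_edge.

    Aspect ratio is preserved and images already within the limit are not upscaled.
    """
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height
    scale = max_long_edge / long_edge
    # Clamp to 1 so extreme aspect ratios never round a side down to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_long_edge(4000, 3000))        # prints "(2576, 1932)"
print(fit_long_edge(1024, 768))         # prints "(1024, 768)"
print(fit_long_edge(3000, 4000, 1288))  # prints "(966, 1288)"
```

Passing a smaller `max_long_edge`, as in the last call, is one way to trade visual detail for lower token consumption when the task does not need full resolution.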
The Bottom Line
Claude Opus 4.7 is a meaningful upgrade that strengthens Anthropic’s position in the frontier AI race. The 12-point jump on CursorBench, the 3x improvement on Rakuten-SWE-Bench, the dramatic gains in visual understanding, and the new effort control features make it a compelling choice for developers, enterprises, and anyone running autonomous AI workflows.
But the most important story may not be Opus 4.7 itself — it’s what it reveals about Anthropic’s approach to building and releasing increasingly powerful AI systems. By positioning Opus 4.7 as a safety testbed for Mythos-class capabilities, the company is attempting something the industry has rarely seen: a staged rollout where each step informs the safety infrastructure of the next.
Whether this cautious, incremental approach can keep pace with competitors who release models more aggressively remains an open question. But for now, Opus 4.7 proves that Anthropic can deliver substantial capability improvements without sacrificing its commitment to responsible deployment — and that’s a balance the AI industry desperately needs.
If the benchmarks and early user feedback hold up at scale, Opus 4.7 may well be the model that convinces more enterprises to move AI agents from pilot projects to production — and that’s a tipping point the entire industry is watching.


