Claude Opus 4
A full guide to Claude Opus 4: what makes it different from Sonnet, when to use it, pricing breakdown, and how Verdent uses Opus 4 for multi-agent deep research tasks.

Claude Opus 4 was Anthropic's top model in the original Claude 4 family. It is now deprecated, and Anthropic will retire it on June 15, 2026.

For teams that still see Claude Opus 4 in configs, routing rules, eval scripts, or vendor layers, the important question is how to handle the remaining use cases without letting a legacy model quietly shape production workflows.

In Verdent, Claude Opus 4 can be used inside Plan Mode and parallel worker dispatch, where its reasoning is applied to deep research, complex code analysis, and high-value planning tasks. Verdent adds the surrounding workflow: task breakdown, delegation, verification, review controls, and handoff into inspected engineering work.

That makes Opus 4 useful when the job justifies an expensive reasoning model, especially for multi-step investigation or migration planning. Routine edits, simple summaries, and low-risk implementation tasks usually belong on cheaper or newer models.

What Is Claude Opus 4

Anthropic released Claude Opus 4 in May 2025 as the highest-capability model in the original Claude 4 family.

Its API model ID is claude-opus-4-20250514.

Opus 4 targeted deep reasoning, coding, and long-running agent work. It was suited to tasks that required sustained context, careful planning, and multiple reasoning steps across a large codebase.

Opus 4 is now a legacy model identifier. Anthropic has marked it for retirement on June 15, 2026, and Opus 4.8 is the current recommended replacement.

Teams should treat Opus 4 references as migration items. Check application configuration, evaluation scripts, model routing rules, agent prompts, CI helpers, vendor abstraction layers, and internal documentation for hard-coded references before the retirement date.
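The audit described above can be sketched as a small repository scan. This is a minimal illustration, not an official migration tool: the function name `find_legacy_references`, the file-suffix list, and the loose regular expression are all assumptions chosen for this example. The pattern matches the dated ID `claude-opus-4-20250514` and informal spellings such as "Claude Opus 4", while trying to skip later versions such as 4.1 or 4.8.

```python
import re
from pathlib import Path

# Assumption: matches the dated legacy ID, or an undated "claude opus 4"
# that is not followed by a further version digit (4.1, 4-1, 4.8, ...).
LEGACY_PATTERN = re.compile(
    r"claude[-_ ]opus[-_ ]4(?:-20250514|(?![-._]?\d))",
    re.IGNORECASE,
)

# Assumption: file types worth scanning; adjust for your repository.
DEFAULT_SUFFIXES = (".py", ".json", ".yaml", ".yml", ".toml", ".md", ".txt")


def find_legacy_references(root, suffixes=DEFAULT_SUFFIXES):
    """Return (path, line_number, line_text) for lines naming the legacy model."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if LEGACY_PATTERN.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_legacy_references(".")` from a repository root returns a list that can be printed or turned into a migration checklist. A text scan like this will miss IDs assembled at runtime or stored in external configuration, so it complements a manual review and does not replace one.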

Opus 4 vs Sonnet 4: Key Differences

| Area | Opus 4 | Sonnet 4 |
| --- | --- | --- |
| Capability | Higher | Strong |
| Speed | Lower | Higher |
| Input price | $15 per 1M tokens | $3 per 1M tokens |
| Output price | $75 per 1M tokens | $15 per 1M tokens |
| Best fit | Hard problems, deep planning, complex review | Daily coding, routine edits, faster iteration |

Both models used 200K-token context windows. Both are deprecated.

The practical difference was not only raw capability. Opus 4 made more sense when the cost of a bad answer was high: migration planning, architecture review, security-sensitive changes, or debugging failures that crossed service boundaries. Sonnet 4 made more sense for frequent coding work where speed and cost mattered more than maximum reasoning depth.

For new work, neither model should be the default target. Existing Sonnet 4 and Opus 4 usage should be mapped to supported replacements, then tested against the same tasks, repositories, and review gates used in production.

Real-World Coding Use Cases for Opus

Opus 4 fit tasks where a wrong decision caused expensive rework.

Examples included:

  • Designing a service migration.
  • Debugging distributed failures.
  • Reviewing security-sensitive changes.
  • Planning a large refactor.
  • Coordinating long research tasks.

Routine edits usually did not justify the price.

The best Opus-style workloads are bounded but difficult. They often require the model to read a broad design surface, compare tradeoffs, identify risky edge cases, and produce a plan that a human or worker agent can verify.

Good examples include decomposing a monolith migration, tracing a production bug through logs and code paths, reviewing authentication or authorization changes, planning a database schema transition, or comparing multiple implementation strategies before work begins.

Poor fits include formatting, small copy edits, simple CRUD changes, dependency bumps, test snapshot updates, and repetitive refactors with a narrow pattern. Those tasks usually benefit from lower-cost models, deterministic tooling, or parallel workers with clear review checks.

In cases where the same high-stakes coding workload may benefit from newer behavior, Claude Opus 4.1 is the closest comparison point.

To validate these details at the source, check Anthropic's documentation once the Claude Opus 4 workflow described here is clear.

Token Pricing & Cost Breakdown

Historical standard pricing was:

| Token type | Price per 1M tokens |
| --- | --- |
| Input | $15 |
| Output | $75 |
| Cache hit | $1.50 |

Extended thinking tokens counted as output.

Opus 4.8 now costs $5 input and $25 output per million tokens. It is the stronger migration target.
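To make the price gap concrete, here is a rough cost estimate using only the per-token prices quoted in this section. The dictionary keys `opus-4` and `opus-4.8` are shorthand labels for this example, not API model IDs, and the token counts are invented for illustration. Cache pricing is left out to keep the arithmetic simple.

```python
# USD per 1M tokens, taken from the prices quoted in this section.
# The keys are illustrative labels, not API model IDs.
PRICES = {
    "opus-4": {"input": 15.00, "output": 75.00},
    "opus-4.8": {"input": 5.00, "output": 25.00},
}


def estimate_cost(model, input_tokens, output_tokens, thinking_tokens=0):
    """Estimate request cost in USD; thinking tokens are billed as output."""
    price = PRICES[model]
    billed_output = output_tokens + thinking_tokens
    total = input_tokens * price["input"] + billed_output * price["output"]
    return total / 1_000_000


# Hypothetical task: 200K input, 20K visible output, 30K thinking tokens.
legacy = estimate_cost("opus-4", 200_000, 20_000, thinking_tokens=30_000)
current = estimate_cost("opus-4.8", 200_000, 20_000, thinking_tokens=30_000)
print(f"Opus 4: ${legacy:.2f}  Opus 4.8: ${current:.2f}")
# prints "Opus 4: $6.75  Opus 4.8: $2.25"
```

In this hypothetical task, output and thinking tokens account for more than half of the bill even though they are a quarter of the input volume, which is why long reasoning loops matter for budgeting.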

Real workflow cost depends on more than the first prompt. Long agent loops can multiply output tokens through plans, intermediate reasoning, tool calls, retries, explanations, and final review notes. A task that looks small in input size can become expensive if the model repeatedly explores the repository or rewrites the same patch.

The safer pattern is to reserve expensive reasoning for specific phases. Use it for planning, risk analysis, architecture review, or final validation. Use lower-cost models or deterministic tools for routine implementation, formatting, search, and mechanical edits.
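One way to encode this pattern is a small phase-to-tier lookup. The sketch below is illustrative only: the `Tier` enum, the phase names, and the specific assignments are assumptions for this example and do not describe how Verdent or any particular router is configured. In particular, the choice of which routine phases go to a cheaper model and which go to a deterministic tool is a judgment call.

```python
from enum import Enum


class Tier(Enum):
    REASONING = "reasoning"    # expensive, high-capability model
    STANDARD = "standard"      # lower-cost model
    DETERMINISTIC = "tool"     # formatter, linter, or search; no model call


# Assumption: phase names and assignments are examples, not a fixed taxonomy.
PHASE_TIERS = {
    "planning": Tier.REASONING,
    "risk_analysis": Tier.REASONING,
    "architecture_review": Tier.REASONING,
    "final_validation": Tier.REASONING,
    "implementation": Tier.STANDARD,
    "mechanical_edit": Tier.STANDARD,
    "formatting": Tier.DETERMINISTIC,
    "search": Tier.DETERMINISTIC,
}


def tier_for(phase):
    """Return the tier for a phase, defaulting to the cheaper model tier."""
    return PHASE_TIERS.get(phase, Tier.STANDARD)
```

Defaulting unknown phases to the cheaper tier means the expensive model is used only where someone has explicitly decided it is worth the cost.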

Teams should track cost by task outcome, not only by token price. Useful measures include total tokens, accepted changes, tests passed, review comments resolved, retries required, and human correction time.
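A minimal sketch of outcome-level tracking follows. The `TaskRecord` fields and the `summarize` function are hypothetical names chosen for this example; a real pipeline would fill them from its own logs and review tooling.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    total_tokens: int
    cost_usd: float
    accepted: bool        # was the change accepted in review?
    tests_passed: bool
    retries: int
    human_minutes: float  # time a person spent correcting the result


def summarize(records):
    """Aggregate outcome-level measures across a batch of task records."""
    accepted = sum(1 for r in records if r.accepted)
    total_cost = sum(r.cost_usd for r in records)
    return {
        "tasks": len(records),
        "accepted": accepted,
        "tests_passed": sum(1 for r in records if r.tests_passed),
        "total_tokens": sum(r.total_tokens for r in records),
        "total_cost_usd": total_cost,
        # Cost per accepted change is undefined when nothing was accepted.
        "cost_per_accepted_usd": total_cost / accepted if accepted else None,
        "retries": sum(r.retries for r in records),
        "human_minutes": sum(r.human_minutes for r in records),
    }
```

Cost per accepted change is often more informative than cost per request, because it charges rejected attempts and retries to the work that actually shipped.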

Routine implementation phases often pair better with Claude Haiku 4.5, especially when speed and lower output cost matter more than maximum reasoning depth.

When exact limits or setup steps matter, confirm them against Anthropic's current documentation.

Claude Opus 4 vs GPT-5

Model quality depends on the workflow.

Opus 4 was strong in deep reasoning and long agent loops. GPT-5 offered a different tool and reasoning stack.

Compare complete task outcomes:

  • Tests passed.
  • Human fixes required.
  • Time to completion.
  • Total token cost.
  • Tool failures.

A model comparison should use the same repository, task brief, tool access, review criteria, and merge requirements. Otherwise, the result can reflect the surrounding workflow more than the model itself.
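The like-for-like requirement can be enforced in code before any numbers are compared. The sketch below assumes a hypothetical `RunSetup` record that fingerprints the conditions of a run, and keeps only the setups that every model completed. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSetup:
    """Conditions that must match for two runs to be comparable."""
    repository: str
    task_brief: str
    tools: tuple
    review_criteria: str


@dataclass
class RunResult:
    model: str
    setup: RunSetup
    tests_passed: bool
    human_fixes: int
    minutes: float
    cost_usd: float
    tool_failures: int


def comparable_runs(results):
    """Group results by setup and keep only setups that every model ran."""
    models = {r.model for r in results}
    by_setup = {}
    for r in results:
        by_setup.setdefault(r.setup, {})[r.model] = r
    return {
        setup: runs for setup, runs in by_setup.items() if set(runs) == models
    }
```

Setups that only one model ran are dropped, so a model cannot look better simply because it was given an easier or differently scoped task.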

Opus 4's pending retirement makes it a poor choice for new deployments. Even if an existing integration still works, new model routing should favor supported models with a clear migration path.

> A repository-level proof point
>
> A model can sound certain and still fail the repository. Verdent's 76.1% SWE-bench Verified result is evidence for testing the change, not trusting the tone.
>
> The point is not more agent activity. It is less Blind AI, less Code Chaos, and a result the team can inspect.

Use Claude 4 to understand the broader model family before deciding whether Opus 4 still belongs in an existing workflow.

Before you budget a real project around Claude Opus 4, compare the claims here with Platform.

Opus 4 in Verdent Multi-Agent Workflows

Verdent now uses newer models.

Claude Fable 5 and Opus 4.8 are listed in current plans. They can handle the difficult reasoning role once assigned to Opus 4.

Verdent prevents one expensive model from doing every step. Parallel agents can divide research, implementation, and verification.

In a Verdent workflow, a high-reasoning model is most useful when it defines the plan, identifies risk, or reviews complex decisions. Worker agents can then handle scoped implementation tasks, gather repository context, update tests, and report results back through the plan.

This keeps the expensive model focused on judgment instead of volume. It also gives the team clearer checkpoints: what changed, why it changed, which files were touched, which tests ran, and what still needs human review.

For teams migrating away from Opus 4, the important step is not only swapping a model name. The replacement should be tested inside the same plan-first, review-before-merge process that will govern production work.

Frequently Asked Questions

Is Claude Opus 4 retired?

Not yet. Anthropic will retire Claude Opus 4 on June 15, 2026. Existing integrations should be reviewed before that date so production systems do not depend on an unsupported model.

What replaces Opus 4?

Anthropic recommends Opus 4.8 as the replacement. Teams should test the replacement against their own coding, planning, review, and agent workflows before changing production routing.

Was Opus 4 more expensive than Sonnet 4?

Yes. Historical standard pricing listed Opus 4 at $15 per 1M input tokens and $75 per 1M output tokens, while Sonnet 4 was $3 per 1M input tokens and $15 per 1M output tokens.

Did Opus 4 support extended thinking?

Yes. Extended thinking was supported, and those tokens counted as output tokens. That matters for cost planning because long reasoning loops can increase total spend.

Does Verdent list Opus 4.8?

Yes. Verdent lists Opus 4.8 in current plans and uses newer models for the difficult reasoning roles that Opus 4 previously handled.

Migrate Before You Need To

Claude Opus 4 retires June 15, 2026. Existing integrations still need a supported replacement.

Before migration, teams should find hard-coded model IDs, check routing rules, update evaluation scripts, and test the replacement on real repository tasks.

The replacement can be tested inside the same plan-first, review-before-merge process, so the team can compare task quality, cost, latency, and human review effort before production use.

Next Step

Replace Claude Opus 4 Before Retirement

Claude Opus 4 retires on June 15, 2026. Test a supported replacement now while keeping Verdent’s plan-first workflow and review gate in place.