リソース

コミュニティ

Claude 4

Full breakdown of the Claude 4 model family — Opus 4, Sonnet 4, context windows, benchmark scores, and what changed from Claude 3 to Claude 4 for developers.

Claude 4 was Anthropic's May 2025 model family for coding, reasoning, tool use, and long-running agent workflows.

The original lineup included Claude Sonnet 4 and Claude Opus 4. Both models are now deprecated and scheduled to retire on June 15, 2026, so teams still using Claude 4 should plan migrations to newer Claude models such as Sonnet 4.6 or Opus 4.8.

For developers, Claude 4 matters as a transition point: it improved agentic coding and extended-context work, but production use depends on model IDs, latency, cost, evaluation suites, repository context, and safe rollout plans.

In Verdent, Claude models run within Plan Mode, parallel worker dispatch, workspace safety controls, and code verification. The model supplies reasoning, while Verdent structures that reasoning into scoped tasks, reviewed changes, and a verified result.

Start Free With Verdent AI

Claude 4 Model Family Overview

Claude 4 targeted coding, reasoning, tool use, and long-running agent tasks.

The original family had two main models:

Model	Role	Historical price
Claude Sonnet 4	Balanced coding and agent model	$3 input / $15 output
Claude Opus 4	Highest-capability reasoning and research model	$15 input / $75 output

Prices are per million tokens. The original Sonnet 4 and Opus 4 models are deprecated and scheduled to retire on June 15, 2026.

Claude Sonnet 4 was the practical default for daily engineering work because it balanced speed, capability, and cost. Claude Opus 4 fit harder tasks where the model needed deeper reasoning, more careful tradeoff analysis, or stronger synthesis across a large codebase.

Teams maintaining older Claude 4 integrations should treat the retirement date as a migration deadline. Start by finding every configured Sonnet 4 or Opus 4 model ID, then map each usage to the workflow it supports: code review, test generation, bug fixing, research, architecture planning, or automation. After that, test the same workflows against the recommended newer Claude models before changing production defaults.

What Changed from Claude 3 to Claude 4

Claude 4 improved agentic execution compared with Claude 3.

The main changes were:

Better long-horizon coding.
Stronger tool use.
Extended thinking.
Improved memory workflows.
More reliable multi-file changes.

The shift was from answering coding questions to completing longer coding tasks. Claude 3 was useful for explanations, snippets, and isolated reasoning. Claude 4 moved closer to agent workflows where the model had to inspect files, plan edits, use tools, preserve context, and complete a task across several steps.

That mattered for repository maintenance. A coding agent must understand the request, identify relevant files, avoid unrelated edits, modify code, run or recommend tests, and explain the result. Claude 4 made those workflows more realistic because it handled state, tools, and multi-file reasoning more reliably than earlier Claude 3 models.

For buyers and engineering leads, the practical change was workflow design. Claude 4 performed best when paired with clear task boundaries, scoped repository context, explicit acceptance criteria, and verification. Without those controls, a stronger model could still waste tokens, over-edit files, or miss project-specific constraints.

Opus 4 vs Sonnet 4 Comparison

Sonnet 4 was faster and cheaper. Opus 4 was designed for harder reasoning, deeper research, and more complex planning.

Decision	Better historical fit
Daily coding	Sonnet 4
High-volume tasks	Sonnet 4
Test generation	Sonnet 4
Routine bug fixes	Sonnet 4
Complex architecture	Opus 4
Deep research	Opus 4
Ambiguous product decisions	Opus 4
High-stakes refactors	Opus 4

The simplest selection rule was cost versus reasoning depth. Use Sonnet 4 when the task was well-scoped, repetitive, or easy to verify. Use Opus 4 when the task required slower analysis, larger tradeoffs, or synthesis across many files and requirements.

For production systems, many teams used a tiered pattern: route planning, architecture, or difficult investigations to Opus 4, then route implementation, test updates, and routine follow-up work to Sonnet 4. That pattern reduced cost while keeping the highest-capability model available for the hardest decisions.

For current work, compare Sonnet 4.6 and Opus 4.8 instead. The same decision logic still applies: match the model to the task, then measure output quality, latency, retries, human review time, and total cost.

The distinction becomes clearer when comparing Sonnet 4 against Claude Opus 4 for slower planning, architecture tradeoffs, and high-stakes refactors.

For source-level validation, Claude is worth checking after you understand the Claude 4 workflow described here.

Context Window & Token Limits

The original Claude 4 models used 200K-token context windows.

A 200K-token context window helped with broad repository analysis, long documents, specifications, logs, and multi-file debugging. It allowed the model to see more material in one request, which reduced the need to constantly re-send missing background.

Large context did not remove the need for scope control. A model can still miss the important file if the prompt includes too much unrelated code. Large prompts also increase cost, latency, and distraction. The best workflow is to pass the smallest complete context that lets the model solve the task.

For development work, that usually means:

Start with the user request and acceptance criteria.
Add the relevant files, tests, and error output.
Include project conventions only when they affect the change.
Keep unrelated files out of the prompt.
Verify the result with tests, review, or static checks.

Verdent follows this principle by using planning and worker boundaries around the model. Plan Mode narrows the task before execution. Parallel workers can investigate or implement separate parts of the work without turning the entire repository into one unstructured chat prompt.

Current Claude models can offer larger or different context limits. Always check the active model table before setting production assumptions.

For latency-sensitive tasks where a smaller context is enough, Claude Haiku may be a better fit than sending large prompts to a higher-capacity model.

When details such as limits or setup steps matter, Youtube can help confirm the latest implementation surface.

Claude 4 Benchmarks vs GPT-5

Claude 4 and GPT-5 belong to different release periods, so direct benchmark tables can be misleading.

Public scores often mix different harnesses, prompts, tool permissions, retry policies, and token budgets. A model that performs well in one benchmark may behave differently inside a real repository with project conventions, failing tests, dependency constraints, and human review requirements.

Compare models with the same task instead:

Use one repository.
Give each model the same issue or feature request.
Set the same time limit.
Provide the same tools and file access.
Require the same tests or review checks.
Count retries, failed patches, and human fixes.
Measure total cost, not only token price.

The most useful benchmark for a development team is an end-to-end workflow score. Did the model understand the task? Did it edit the right files? Did it avoid unrelated changes? Did tests pass? Did the reviewer trust the diff?

> Production evidence > > 76.1% on SWE-bench Verified is Verdent's credibility anchor. Plan-First execution and review turn model capability into a result a team can inspect. > > Enterprise-Grade Safety controls the workspace. Code Verification controls the result.

For buyers, the takeaway is simple: benchmark the system, not only the base model. Claude 4, GPT-5, and newer models become more useful when they run inside a workflow that controls planning, execution, verification, and review.

Teams comparing Claude 4 variants should include Claude Opus 4.1 in the same workflow test before choosing a model for complex coding work.

Before you budget a real project around Claude 4, compare the claims here with Anthropic documentation.

Getting Claude 4 in Verdent

Verdent's current built-in model list uses newer models rather than the deprecated original Claude 4 models.

The list includes Claude Fable 5 and Opus 4.8. It also includes current models from OpenAI, Google, GLM, and Kimi.

Verdent adds an execution layer around those models. Instead of treating Claude as a single chat session, Verdent uses planning, parallel workers, safety controls, and verification to manage software work.

That matters during a Claude 4 migration. If a workflow is already defined around tasks, plans, worker boundaries, and tests, changing the model ID is less disruptive. The engineering process stays the same while the underlying model changes.

A practical migration in Verdent should follow this sequence:

Identify workflows that still depend on Claude Sonnet 4 or Claude Opus 4.
Select the newer Claude model that matches each workflow.
Run the same task through the old and new model where possible.
Compare diff quality, test results, review effort, latency, and cost.
Move production defaults after the newer model passes the workflow checks.

Verdent is designed for that kind of controlled model upgrade. The model can change while the planning, dispatch, safety, and verification process remains consistent.

Frequently Asked Questions

Is Claude 4 still active?

The original Claude Sonnet 4 and Claude Opus 4 models are deprecated. Teams should avoid starting new production workflows on those model IDs and should plan migrations to newer Claude models.

When do they retire?

Claude Sonnet 4 and Claude Opus 4 are scheduled to retire on June 15, 2026, on Anthropic's first-party API. Existing integrations should be tested against replacement models before that date.

What replaces them?

Anthropic recommends newer Claude models such as Sonnet 4.6 and Opus 4.8. The right replacement depends on the workflow: Sonnet-class models usually fit daily coding and high-volume tasks, while Opus-class models fit deeper reasoning and complex architecture work.

Did Claude 4 support extended thinking?

Yes. Claude 4 supported extended thinking, which helped with multi-step reasoning, tool use, and longer coding tasks. Extended thinking was most useful when the task included clear goals, relevant context, and a verification step.

Can Verdent use newer Claude models?

Yes. Verdent can use newer Claude models inside its planning, worker dispatch, safety, and verification workflow. That lets teams upgrade the model while keeping the operating process stable.

Migrate Before You Need To

Claude Sonnet 4 and Claude Opus 4 retire June 15, 2026. The code written against them does not disappear, but any workflow that depends on those model IDs needs a replacement path.

A good migration changes the model ID without forcing a new engineering process. Keep the same task definitions, acceptance criteria, tests, and review checks. Then compare the newer model on real work before switching production defaults.

Next Step

Plan Your Claude 4 Migration Now

Claude Sonnet 4 and Opus 4 retire on June 15, 2026. Move your workflows to current Claude models while keeping your operating process intact.

Move to Sonnet 4.6 or Opus 4.8 See the Current Model Lineup