Pular para o conteúdo principal

Claude Sonnet 4

Claude Sonnet 4
Everything you need to know about Claude Sonnet 4 — Deep Think mode, SWE-bench 78%, coding benchmarks, and how to use it inside Verdent for solo builders.

Claude Sonnet 4 launched in May 2025 as the balanced model in Anthropic's original Claude 4 family.

It is now deprecated on Anthropic's first-party API, with retirement scheduled for June 15, 2026. Teams still using it should identify where it appears in prompts, SDK calls, eval scripts, routing rules, and production workflows.

Verdent's built-in model lineup has already moved beyond Claude Sonnet 4. In Verdent, Manager can turn migration into a controlled workflow: plan the change, assign updates in isolated worktrees, compare test results, and review the final diff prior to merging.

The goal is not just to swap a model name. It is to preserve expected behavior while moving to a supported Claude option with clear testing, verification, and rollback paths.

Claude Sonnet 4 Overview

Claude Sonnet 4 was the balanced model in the original Claude 4 family. It targeted everyday coding, tool use, vision tasks, and multi-step reasoning without the higher cost profile of Opus 4.

Its API model ID is:

claude-sonnet-4-20250514

It supported a 200K-token context window, extended thinking, vision input, and tool-based workflows. Anthropic recommends Sonnet 4.6 for migration from Sonnet 4.

Treat Claude Sonnet 4 as a legacy production dependency rather than a model to choose for new work. If it appears in config files, SDK calls, eval scripts, agent routing, or prompt templates, document four details before replacing it: the exact model ID, the task type, the expected output format, and any prompts that rely on Sonnet 4-specific behavior.

A safe migration starts with inventory. Search for the model ID, check environment variables, review provider-specific model aliases, and identify any tests that measure output quality. Then run the replacement model against the same prompts and compare correctness, latency, cost, and failure modes.

Deep Think Mode Explained

Anthropic did not call this feature "Deep Think." The official term was extended thinking.

Extended thinking let Claude spend additional tokens on internal reasoning before producing the final answer. It helped when a task required planning across constraints, files, tools, or tradeoffs. It also increased output-token cost and latency.

Use extended thinking for:

  • Architecture decisions that require comparing tradeoffs.
  • Difficult debugging where the cause is not obvious.
  • Multi-step migrations across services or packages.
  • Complex tool planning where the model must decide what to inspect before editing.
  • Refactors where tests, interfaces, and generated files need coordinated changes.

Keep it off for simple edits, short explanations, one-file changes, and narrow formatting tasks. In those cases, extended thinking can add cost and delay without materially improving the final patch.

For migration work, test extended thinking separately from the base model choice. A newer Claude model without extended thinking may outperform Sonnet 4 on routine work, while extended thinking may still be useful for hard planning tasks.

Coding & SWE-bench Performance

Claude Sonnet 4 performed strongly on software engineering evaluations, especially compared with earlier Sonnet models. Its practical strength was longer coding loops: reading context, using tools, editing files, responding to test failures, and continuing through multi-step changes.

There was no universal "78%" Claude Sonnet 4 score across every SWE-bench setup. SWE-bench results vary by benchmark version, harness design, tool access, sampling strategy, retry policy, and whether parallel test-time compute is allowed. A score from one agent system should not be treated as the raw capability of the base model in every environment.

For developers, the important question is not only the headline benchmark. Ask whether the model can preserve behavior, follow repository conventions, run the right tests, and produce reviewable diffs. A lower-latency model with better workflow controls can be more useful than a stronger raw model used without planning or verification.

When replacing Claude Sonnet 4, run a small internal eval before changing production routing. Use real tasks from your codebase, include both successful and failed historical tickets, and score each candidate on compile success, test pass rate, diff size, instruction following, and human review effort.

Before changing production routing, compare the same repository tasks against Claude Sonnet 4.5 to see whether its coding loop improves test pass rate and review effort.

For source-level validation, Anthropic documentation is worth checking after you understand the Claude Sonnet 4 workflow described here.

Claude Sonnet 4 vs Opus 4

AreaSonnet 4Opus 4
Role in Claude 4 familyBalanced daily modelHigher-reasoning flagship model
SpeedFasterSlower
Historical standard price$3 input / $15 output per 1M tokens$15 input / $75 output per 1M tokens
Reasoning depthStrongHigher
Best fitDaily coding, tool use, routine agent workHard research, architecture, complex analysis
Current statusDeprecatedDeprecated

Both models are scheduled to retire on June 15, 2026, on Anthropic's first-party API.

Sonnet 4 was usually the better fit for high-volume coding workflows because it balanced capability, speed, and cost. Opus 4 was better for the hardest planning and analysis tasks, but its higher price made it less suitable for every agent step.

For new work, do not choose between these deprecated models. Choose a supported replacement, then decide how to route tasks by difficulty. Routine implementation can use a faster model, while difficult architecture, debugging, or review tasks can use a stronger model inside a controlled workflow.

A useful historical baseline for this comparison is Claude 3.5 Sonnet, since it shows how Sonnet’s balance of speed, cost, and coding strength evolved before Claude 4.

When details such as limits or setup steps matter, Anthropic documentation can help confirm the latest implementation surface.

Context Window & API Pricing

Claude Sonnet 4 used a 200K-token context window. That window made it useful for large prompts, long files, repository summaries, product specifications, and multi-document review.

Historical standard pricing was:

Token typePrice per 1M tokens
Input$3
Output$15

The 200K-token window did not remove the need for context management. Large prompts can increase latency, raise cost, and bury the most important instructions. For coding agents, better results usually come from focused context: the relevant files, the failing test output, the goal, the constraints, and the expected review criteria.

Do not start a new production integration on Claude Sonnet 4. Migrate before retirement, and confirm whether your provider follows Anthropic's first-party lifecycle date or a separate partner schedule.

Verdent's quality signal is 76.1% on SWE-bench Verified. Verdent earns that result through a delivery system, not a single unreviewed model response. Parallel Power increases throughput, while Production-Ready Quality keeps the result out of Quality Roulette.

To understand what changed in context handling and coding behavior, compare Sonnet 4 against Claude 3.7 Sonnet before estimating migration cost.

Before you budget a real project around Claude Sonnet 4, compare the claims here with Openrouter.

Using Claude Sonnet 4 in Verdent

Verdent now lists newer Claude models. The current built-in options include Claude Fable 5 and Opus 4.8. Verdent also supports BYOK and BYOA for teams that need their own keys, accounts, or provider relationships.

Use Verdent when the task needs more than one model response. Plan Mode structures the work before edits begin. Parallel agents execute separate tasks in isolated worktrees. Verification checks the code before the result reaches the main branch.

A practical Claude Sonnet 4 migration in Verdent follows this workflow:

  1. Inventory every place Claude Sonnet 4 appears, including model IDs, aliases, prompts, eval scripts, and agent routing rules.
  2. Use Manager to split the migration into safe units, such as SDK updates, prompt adjustments, benchmark reruns, and production config changes.
  3. Run candidate replacements in separate worktrees so each model or prompt strategy produces an isolated diff.
  4. Compare outputs using tests, logs, cost estimates, latency, and human review notes.
  5. Merge only the version that preserves behavior and improves maintainability.

This workflow is safer than a blind model rename. It treats the model as one part of the delivery system: planning, execution, verification, and review all matter.

Frequently Asked Questions

Is Claude Sonnet 4 deprecated?

Yes. Claude Sonnet 4 is deprecated on Anthropic's first-party API. Treat it as a legacy dependency and avoid starting new production integrations with it.

When does Claude Sonnet 4 retire?

Claude Sonnet 4 retires on June 15, 2026, on Anthropic's first-party API. Teams using it in production should migrate before that date and verify whether any cloud partner has a different lifecycle schedule.

Did it have a feature called Deep Think?

No. Anthropic called the feature extended thinking. Extended thinking let the model spend additional reasoning tokens before the final answer, which helped with difficult planning but increased cost and latency.

What replaces Claude Sonnet 4?

Anthropic recommends Claude Sonnet 4.6 as the migration path from Claude Sonnet 4. In Verdent, teams can also use newer built-in Claude options such as Claude Fable 5 and Opus 4.8, depending on task difficulty and workflow needs.

Can I still use it through a cloud partner?

Possibly. Cloud partner lifecycle dates may differ from Anthropic's first-party API schedule. Check the provider's current model table, confirm the exact model ID, and test a supported replacement before the retirement window closes.

Migrate Before You Need To

Claude Sonnet 4 retires June 15, 2026. Existing integrations still need a supported replacement, even if they work today.

Use the migration to improve the workflow, not just the model name. Inventory usage, test replacement models, compare diffs, and keep review gates in place. The model changes, but the plan, workspaces, verification, and review path stay stable.

Next Step

Prepare Your Claude Sonnet 4 Migration

Claude Sonnet 4 retires June 15, 2026. Move your workflows to a supported model while keeping your plan, workspaces, and review process stable.