DeepSeek V4: Claude Code Setup

Rui Dai, Engineer

Claude Code's UX, DeepSeek V4's pricing. That's the promise of this setup — and the reason "deepseek claude code" became a rising search query in May 2026, driven by developers looking to keep the terminal workflow they know while significantly reducing token costs. The setup works. It also has real limits that are worth understanding before you commit to it.

This is not an officially supported configuration by Anthropic or DeepSeek. DeepSeek documents the integration; Anthropic tolerates it. If it breaks, neither company's support team is your first call.

Why Developers Are Running DeepSeek V4 in Claude Code

The "deepseek claude code" search spike (May 2026, +170% rising query)

The combination makes sense on paper: Claude Code is a mature terminal agent with CLAUDE.md, /ultrareview, task budgets, and MCP support. DeepSeek V4 is a competitive frontier model at a fraction of Anthropic's per-token cost. DeepSeek's Anthropic-compatible API endpoint (/anthropic) was designed exactly for this use case — and when DeepSeek confirmed the integration in their official Claude Code integration guide, developers noticed.

The search spike followed DeepSeek's formal endorsement of the setup in their coding agents integration guide. It's not a hack. It's a documented path that has been functionally available since DeepSeek exposed the /anthropic endpoint.

What you actually get vs native DeepSeek-TUI

Running DeepSeek V4 through Claude Code is different from running DeepSeek-TUI directly. The key differences:

| | Claude Code + DeepSeek V4 | DeepSeek-TUI |
|---|---|---|
| Interface | Claude Code terminal UX | Rust-native ratatui TUI |
| CLAUDE.md / AGENTS.md | ✅ Full | ✅ Reads .claude/skills too |
| Claude Code Routines | | |
| /ultrareview | ❌ (Anthropic backend only) | |
| Task budgets | ❌ (Anthropic backend only) | |
| RLM parallel sub-agents | | ✅ (1–16 flash children) |
| Cost | DeepSeek V4 rates | DeepSeek V4 rates |

If you're already invested in Claude Code's workflow — CLAUDE.md, existing MCP servers, Routines — this setup preserves that investment while switching the model backend. If you want RLM parallel sub-agents or a Rust-native TUI, DeepSeek-TUI is the better fit.

When this setup makes sense and when it doesn't

Makes sense: You're paying for Anthropic's API at full Claude rates and want to reduce costs on high-volume or experimental work. You want to test DeepSeek V4's capability in your existing Claude Code workflow before committing to a different tool. You have CLAUDE.md files and MCP servers you don't want to migrate.

Doesn't make sense: You're on a Claude Code subscription (Pro/Max) rather than API key billing — the subscription doesn't route through this env var setup. You need /ultrareview, task budgets, or any Anthropic-native feature. You're working with sensitive or proprietary code (read the security section below).

Before You Start: Requirements and What This Can't Do

Claude Code version and OS support

This setup requires the Claude Code CLI (terminal and VS Code surfaces only). Claude Desktop does not support this configuration: Desktop uses OAuth, not API keys, and does not read ANTHROPIC_AUTH_TOKEN or ANTHROPIC_BASE_URL. Set this up in the terminal or via the VS Code extension.

Install or verify Claude Code:

# Install via native installer (recommended — npm install is deprecated):
# macOS / Linux:
curl -fsSL https://claude.ai/install.sh | bash
# Or: brew install --cask claude-code

# Verify:
claude --version

Node.js 18+ is required if you use the npm path. The native installer has no Node.js dependency.

DeepSeek API key from platform.deepseek.com

Sign up at platform.deepseek.com and create an API key. DeepSeek has offered new-user credits at signup — verify current credit availability at the platform; this changes. Your key will look like sk-xxxxxxxxxxxxxxxx.

What native Claude Code features still work, what doesn't

Still works: CLAUDE.md, AGENTS.md, MCP servers (stdio transport), claude slash commands that don't require Anthropic's backend, the approval gate system, session history and /resume, Routines (the scheduling mechanism), and most of the Claude Code CLI surface.

Does not work: /ultrareview (requires Anthropic's backend), task budgets (the task-budgets-2026-03-13 beta header is Anthropic-specific), xhigh reasoning effort (Anthropic feature), Anthropic-signed skills from Claude Code's Skill ecosystem, and any feature that calls Anthropic's infrastructure directly.

Step-by-Step Setup

Step 1 — Install or update Claude Code

# macOS/Linux native:
curl -fsSL https://claude.ai/install.sh | bash

# Verify installation:
claude --version

Anthropic's docs now call npm installation deprecated. If you're on an older npm install and seeing issues, migrate to the native installer.

Step 2 — Generate a DeepSeek API key

  1. Go to platform.deepseek.com
  2. Navigate to API Keys → Create new key
  3. Copy the key — it's shown only once
  4. Add billing credit if needed (new user credits may cover initial testing)

Step 3 — Set the environment variables

Linux / macOS (shell):

# Values are quoted so the shell never treats <...> as a redirect or [1m] as a glob pattern.
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="<your DeepSeek API Key>"
export ANTHROPIC_API_KEY="<your DeepSeek API Key>"   # some Claude Code versions check both
export ANTHROPIC_MODEL="deepseek-v4-pro[1m]"
export ANTHROPIC_DEFAULT_OPUS_MODEL="deepseek-v4-pro[1m]"
export ANTHROPIC_DEFAULT_SONNET_MODEL="deepseek-v4-pro[1m]"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="deepseek-v4-flash"
export CLAUDE_CODE_SUBAGENT_MODEL="deepseek-v4-flash"
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
export CLAUDE_CODE_EFFORT_LEVEL=max

Windows (PowerShell):

$env:ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
$env:ANTHROPIC_AUTH_TOKEN="<your DeepSeek API Key>"
$env:ANTHROPIC_API_KEY="<your DeepSeek API Key>"
$env:ANTHROPIC_MODEL="deepseek-v4-pro[1m]"
$env:ANTHROPIC_DEFAULT_OPUS_MODEL="deepseek-v4-pro[1m]"
$env:ANTHROPIC_DEFAULT_SONNET_MODEL="deepseek-v4-pro[1m]"
$env:ANTHROPIC_DEFAULT_HAIKU_MODEL="deepseek-v4-flash"
$env:CLAUDE_CODE_SUBAGENT_MODEL="deepseek-v4-flash"
$env:CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC="1"
$env:CLAUDE_CODE_EFFORT_LEVEL="max"

Via settings.json (Claude Code config file, persistent):

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your DeepSeek API Key>",
    "ANTHROPIC_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_SUBAGENT_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_EFFORT_LEVEL": "max"
  }
}

The settings.json path is ~/.claude/settings.json (global) or .claude/settings.json in your project directory. The full reference config is also in deepseek-ai/awesome-deepseek-agent.
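
Before launching Claude Code, it can help to confirm the shell actually has the core variables from this step. The following is a minimal sketch, not part of either vendor's tooling: `check_deepseek_env` is a hypothetical helper name, and it only checks the three variables the integration cannot work without.

```shell
# Hypothetical helper: report which core Step 3 variables are unset or empty.
check_deepseek_env() {
  missing=""
  [ -n "${ANTHROPIC_BASE_URL:-}" ]   || missing="$missing ANTHROPIC_BASE_URL"
  [ -n "${ANTHROPIC_AUTH_TOKEN:-}" ] || missing="$missing ANTHROPIC_AUTH_TOKEN"
  [ -n "${ANTHROPIC_MODEL:-}" ]      || missing="$missing ANTHROPIC_MODEL"
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}
```

Run `check_deepseek_env` in the same terminal you will start `claude` from. Variables set only in settings.json will not show up here, since Claude Code reads that file itself.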

Step 4 — Pick a model: deepseek-v4-pro vs deepseek-v4-flash

deepseek-v4-pro[1m] — the flagship. 1.6T total parameters, 49B active, 1M context window. Use for complex agent tasks, multi-file refactors, reasoning-intensive work. More expensive per token.

deepseek-v4-flash — the efficient tier. 284B total parameters, 13B active, 1M context window. Use for subagent roles, simpler edits, exploratory work. Significantly cheaper per token.

The [1m] suffix on the model ID explicitly requests the 1M context window. Without it, the default context may be shorter.

Step 5 — Verify the connection with a small task

claude

Inside the session, run:

/status

This shows the current base URL and model. If you see api.deepseek.com/anthropic in the base URL field, the environment variables are loaded. Run a small read-only task to confirm model responses are arriving:

Summarize the purpose of this file: README.md

If you get a response, the integration is working.

Configuring 1M Context and Thinking Modes

When 1M context actually pays off

The 1M context window is real, and so is the token cost of filling it: a request that carries 1M tokens of input is billed for 1M input tokens. Loading an entire large codebase into a single context is architecturally possible, but a single session can then cost far more than an estimate based on short prompts.

Use the 1M window for tasks where it genuinely helps: multi-file refactors that need to read many files in one pass, long test-fix sessions where accumulated tool call history matters, or codebase analysis that requires understanding relationships across many files. For short focused tasks, don't pre-load context you don't need.

Reasoning effort: max, high, off

Setting CLAUDE_CODE_EFFORT_LEVEL=max tells Claude Code to request maximum reasoning depth from the model. Through DeepSeek's /anthropic endpoint, this maps to DeepSeek V4's thinking mode. The impact: more thorough output on complex tasks, and higher token consumption on output (thinking tokens are billed).

For tasks where reasoning depth matters less — quick edits, simple lookups, formatting — consider dropping to high or omitting the effort level to reduce thinking token consumption.
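
One way to lower the effort level for a single run, without touching your exported default, is a shell prefix assignment. This is a sketch under the assumption that Claude Code reads CLAUDE_CODE_EFFORT_LEVEL at launch, as the setup above implies; `claude_quick` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: launch Claude Code with "high" effort for this run only.
# The prefix assignment applies to the single `claude` command it precedes;
# an exported CLAUDE_CODE_EFFORT_LEVEL=max stays in place for later sessions.
claude_quick() {
  CLAUDE_CODE_EFFORT_LEVEL=high claude "$@"
}
```

Use `claude_quick` for quick edits and plain `claude` when you want the default set in Step 3.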

Token cost implications

Costs depend on current DeepSeek pricing (check platform.deepseek.com/api-docs/pricing at time of use — rates have changed and may change again). Two things that drive cost in this setup:

  1. Context size: longer context = more input tokens per request
  2. Thinking mode: max effort generates extended reasoning traces billed as output tokens

Context caching works through DeepSeek's /anthropic endpoint — repeated stable content (system prompt, unchanged files) is cached at DeepSeek's end and charged at the cached input rate (~10% of full input rate).
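
To make the caching effect concrete, here is a rough input-cost estimate. It is a sketch only: `estimate_input_cost` is a hypothetical helper, the 1.00 USD per million tokens rate is a placeholder (not DeepSeek's actual price), and the 10% cached rate is the approximate figure quoted above.

```shell
# Hypothetical helper: estimate input cost in USD for one request.
# $1 = total input tokens
# $2 = input tokens served from cache
# $3 = USD per 1M uncached input tokens (placeholder; check the pricing page)
estimate_input_cost() {
  awk -v total="$1" -v cached="$2" -v rate="$3" 'BEGIN {
    fresh = total - cached                       # tokens billed at the full rate
    cost = (fresh * rate + cached * rate * 0.10) / 1000000
    printf "%.4f\n", cost
  }'
}

estimate_input_cost 1000000 0 1.00        # no cache hits: prints 1.0000
estimate_input_cost 1000000 800000 1.00   # 80% cached: prints 0.2800
```

With these placeholder numbers, a fully uncached 1M-token request costs 1.00, while the same request with 80% of its input cached costs 0.28. Output and thinking tokens are billed separately and are not covered by this estimate.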

Common Errors and Fixes

"API streaming failed" or connection errors

API Error: streaming failed

Check 1: Verify ANTHROPIC_BASE_URL is set to exactly https://api.deepseek.com/anthropic — no trailing slash, no /v1.

Check 2: Confirm network connectivity to api.deepseek.com. If you're behind a corporate proxy, set HTTPS_PROXY in the same shell before launching Claude Code.

Check 3: Add CLAUDE_CODE_DISABLE_NONSTREAMING_FALLBACK=1 — this prevents Claude Code from falling back to a non-streaming path that may conflict with DeepSeek's endpoint.
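
Check 1 is easy to get wrong by a single character, so a small validation function can catch the two mistakes named there. This is an illustrative sketch; `validate_base_url` is a hypothetical helper, and the expected value is the one given in Step 3.

```shell
# Hypothetical helper: flag the common ANTHROPIC_BASE_URL mistakes from Check 1.
validate_base_url() {
  case "$1" in
    https://api.deepseek.com/anthropic) echo "ok" ;;
    */)   echo "trailing slash"; return 1 ;;
    */v1) echo "remove /v1"; return 1 ;;
    *)    echo "unexpected value"; return 1 ;;
  esac
}
```

Call it as `validate_base_url "$ANTHROPIC_BASE_URL"` in the shell where the streaming error appeared.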

Authentication failures

API Error: 401 Unauthorized

Common cause: Setting ANTHROPIC_API_KEY but not ANTHROPIC_AUTH_TOKEN. Some Claude Code versions check one, some check both. Set both to your DeepSeek key:

export ANTHROPIC_AUTH_TOKEN=<your DeepSeek key>
export ANTHROPIC_API_KEY=<your DeepSeek key>

Also verify the key is valid: test it against DeepSeek's API directly with a curl call.
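
A direct key check might look like the sketch below. It assumes DeepSeek's OpenAI-compatible model-listing endpoint (GET https://api.deepseek.com/models) with Bearer authentication; confirm the path against DeepSeek's current API docs before relying on it. Both function names are hypothetical.

```shell
# Hypothetical helper: turn an HTTP status code into a hint.
explain_status() {
  case "$1" in
    200) echo "key accepted" ;;
    401) echo "key rejected: re-copy or regenerate it" ;;
    *)   echo "unexpected status $1: check network or service status" ;;
  esac
}

# Hypothetical helper: call DeepSeek directly, bypassing Claude Code.
# Assumes the /models endpoint and Bearer auth (see the note above).
check_deepseek_key() {
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $ANTHROPIC_AUTH_TOKEN" \
    https://api.deepseek.com/models)
  explain_status "$status"
}
```

If `check_deepseek_key` reports the key as accepted but Claude Code still returns 401, the problem is more likely in which variable Claude Code is reading than in the key itself.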

Model not found / wrong model ID

API Error: 404 — model not found

Common cause: Using a legacy model ID. The correct IDs as of V4 are:

  • deepseek-v4-pro[1m] — not deepseek-chat, not deepseek-v4-0424
  • deepseek-v4-flash — not deepseek-reasoner

Legacy aliases deepseek-chat and deepseek-reasoner retire after July 24, 2026. Update model IDs now.
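
If you carry shell profiles or scripts over from an older DeepSeek setup, a quick guard can flag the legacy aliases named above. This is a sketch; `is_legacy_model_id` is a hypothetical helper and only knows the two aliases this article lists.

```shell
# Hypothetical helper: succeed (exit 0) if the given ID is a legacy alias.
is_legacy_model_id() {
  case "$1" in
    deepseek-chat|deepseek-reasoner) return 0 ;;
    *) return 1 ;;
  esac
}

if is_legacy_model_id "${ANTHROPIC_MODEL:-}"; then
  echo "ANTHROPIC_MODEL uses a legacy alias; switch to a deepseek-v4-* ID"
fi
```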

Cost and Trade-offs vs Anthropic-Native Claude Code

Token pricing comparison

As of May 2026, DeepSeek V4 Pro is running at a promotional discount for a limited period. At standard rates, V4 Pro input tokens cost a fraction of what Claude Opus 4.7 input tokens cost — the ratio has been quoted by multiple sources as roughly 10–15× cheaper per million input tokens. V4 Flash is cheaper still.

These are the rates at time of writing; verify at DeepSeek's pricing page before relying on them for budget planning. The promotional rate expires May 31, 2026 at 15:59 UTC.

Capability gaps to expect

What works well: Multi-file code navigation, implementation tasks, debugging sessions, most Claude Code workflow mechanics.

What's weaker vs Claude Opus 4.7: Complex architectural reasoning, instruction following consistency on long sessions, and edge cases in Claude Code's tool loop behavior (some tool calls that Claude handles gracefully may produce errors or unexpected outputs through the proxy endpoint).

What's absent entirely: /ultrareview, task budgets, xhigh effort, Anthropic-native Skill signing, and any Claude Code feature that relies on Anthropic's backend infrastructure rather than the model API.

Security: What Happens to Your Code

This section is not optional reading. When you use this setup, your code, prompts, and tool call outputs transit through DeepSeek's API infrastructure — not Anthropic's.

For open-source, personal, or low-sensitivity code: The risk profile is similar to using any cloud model API. Standard API terms and data handling policies apply.

For enterprise or proprietary code: Your code passes through DeepSeek's servers. DeepSeek is a Chinese company operating under Chinese law. Depending on your organization's data handling requirements, this may violate data residency policies, employment agreements, or client confidentiality obligations. Before using this setup on a professional project, check with your legal or compliance team.

For any production code containing credentials, private keys, or PII: Do not use this setup. Even accidentally including such data in a prompt or a file the agent reads sends it outside Anthropic's data handling agreements.

This is not a criticism of DeepSeek's security posture — it's a structural fact about what happens when you route your code to any non-Anthropic endpoint.

FAQ

Is this an officially supported setup?

No. DeepSeek documents it; Anthropic tolerates it. Neither company provides support for issues that arise from this configuration. If Claude Code updates break the integration, you're waiting for either DeepSeek to update their compatibility layer or for community-documented workarounds.

Do I lose Claude Code skills / CLAUDE.md?

You keep CLAUDE.md and AGENTS.md — they're read at session start regardless of model backend. You lose Anthropic-specific features: skill signing, /ultrareview, task budgets, and the Routines features that call back to Anthropic's infrastructure. MCP servers configured in your Claude Code settings continue to work.

Can I switch back to Anthropic models per session?

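Yes. Unset the environment variables, or launch a new session in a shell that doesn't have the DeepSeek env vars. If you used a settings.json file, remove the DeepSeek entries from its env block (renaming or moving the file also works, but drops any other settings stored in it). Claude Code will fall back to its default Anthropic model for that session.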
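
For the environment-variable route, a small function keeps the switch to one command. This is a sketch: `use_anthropic_backend` is a hypothetical name, and the variable list simply mirrors the ones set in Step 3.

```shell
# Hypothetical helper: clear every variable set in Step 3 for the current shell.
# Note: this also clears ANTHROPIC_API_KEY, so re-export your Anthropic key
# afterwards if you authenticate to Anthropic with an API key.
use_anthropic_backend() {
  unset ANTHROPIC_BASE_URL ANTHROPIC_AUTH_TOKEN ANTHROPIC_API_KEY \
        ANTHROPIC_MODEL ANTHROPIC_DEFAULT_OPUS_MODEL \
        ANTHROPIC_DEFAULT_SONNET_MODEL ANTHROPIC_DEFAULT_HAIKU_MODEL \
        CLAUDE_CODE_SUBAGENT_MODEL CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC \
        CLAUDE_CODE_EFFORT_LEVEL
}
```

This only affects the shell it runs in; values stored in settings.json still apply until you remove them there.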

Is it safe to send proprietary code to DeepSeek's API?

It depends on your definition of "safe" and your organization's requirements. Read the Security section above. For personal or open-source code: generally fine. For confidential enterprise code: verify with your compliance team before using.

Does context caching work through this proxy?

Yes. DeepSeek's /anthropic endpoint supports prompt caching. Repeated content at the beginning of your context (system prompt, stable file content) is cached and billed at DeepSeek's cached input rate — approximately 10% of the full input token rate.

Conclusion

Commit to this setup when: you're paying API rates for high-volume Claude Code usage and want to reduce costs on tasks where DeepSeek V4-Pro's quality is sufficient, you have existing CLAUDE.md and MCP infrastructure you don't want to rebuild, and you're working on code that's appropriate to route through DeepSeek's API.

Stick with Anthropic-native when: you need /ultrareview, task budgets, or xhigh reasoning effort, you need Anthropic's data handling guarantees for your code, or you're a Claude subscription user (Pro/Max) where the integration doesn't apply to your billing model.

Use DeepSeek-TUI directly when: you want RLM parallel sub-agents, native DeepSeek V4 tooling, or the Rust-native TUI without Claude Code's overhead.

Written by Rui Dai, Engineer

Hey there! I’m an engineer with experience testing, researching, and evaluating AI tools. I design experiments to assess AI model performance, benchmark large language models, and analyze multi-agent systems in real-world workflows. I’m skilled at capturing first-hand AI insights and applying them through hands-on research and experimentation, dedicated to exploring practical applications of cutting-edge AI.