DeepSeek

A Tier · 8.0/10

DeepSeek V4 shipped 2026-04-24: V4-Pro (1.6T total / 49B active MoE) and V4-Flash (284B total / 13B active MoE), both with 1M native context, the new Hybrid Attention Architecture, and open weights on HuggingFace. V4-Pro trails only Gemini 3.1 Pro on world-knowledge benchmarks.

Last updated: 2026-04-28 · Free tier available

Score Breakdown

Category	Score
Ease of Use	7.5
Output Quality	8.0
Value	9.5
Features	7.0

Benchmark Scores

Benchmarks for DeepSeek V4-Pro (launched 2026-04-24; scores below are the V3.2 baseline pending third-party V4 verification, which typically lands 3-7 days post-launch)

Chatbot Arena Elo (human preference rating): 1380

Benchmark	Description	Score
MMLU	Knowledge across 57 subjects	90.8%
MMLU-Pro	Harder multi-subject reasoning	85%
GPQA Diamond	Graduate-level science questions	79.9%
HumanEval	Python code generation	91.5%
SWE-bench	Real GitHub issue fixing	67.8%

Last updated: 2026-04-24

Personality & Tone

The open-source reasoning specialist

Tone: Direct and technical. DeepSeek's chat models give compact, math- and code-first answers and are noticeably less chatty than Claude or ChatGPT. When asked to reason, they expose a lot of visible thinking.

Quirks: Refusal patterns differ from Western models -- more permissive on many technical and gray-area prompts, more cautious on China-specific political questions. Community-tuned variants exist with different system prompts and guardrails.

The Good and the Bad

What we like

+ Pricing is absurdly cheap compared to GPT-4 or Claude -- we're talking 90%+ savings on API calls
+ DeepSeek-R1 reasoning model genuinely competes with o1 and o3 on math and coding benchmarks
+ Fully open-source weights mean you can run it locally or fine-tune for your own use case
+ 130M+ users and growing fast, so the ecosystem and community support are solid

What could be better

− Censorship on politically sensitive topics is real and unavoidable -- it's a Chinese company subject to PRC regulations
− English output quality is good but noticeably behind Claude or GPT-4 for nuanced writing tasks
− Hallucinations on niche or domain-specific topics happen more often than with top-tier Western models
− Service reliability has been spotty during high-demand periods -- the free tier especially suffers from rate limiting

Pricing

Free

✓ Web chat access at chat.deepseek.com
✓ V4-Flash by default (as of 2026-04-24 launch)
✓ Basic usage limits

API -- V4-Flash

$0.14 / $0.28 per 1M tokens (input / output)

✓ 284B total / 13B active MoE
✓ 1M native context
✓ Cheapest frontier-class API on market
✓ Pay-as-you-go, no minimum

API -- V4-Pro (75% PROMO active through 2026-05-31)

$0.435 / $0.87 per 1M tokens (input / output, promotional)

✓ 1.6T total / 49B active MoE
✓ 1M native context
✓ Trails only Gemini 3.1 Pro on world knowledge benchmarks
✓ PROMO PRICING active through 2026-05-31 15:59 UTC -- 75% off list ($1.74/$3.48). Cache-hit input drops to $0.003625/M during promo
✓ Post-promo pricing reverts to $1.74/$3.48 per 1M -- still 3-10x cheaper than GPT-5.5 or Claude Opus 4.7
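
To make the pricing arithmetic concrete, here is a minimal sketch that derives the promotional V4-Pro rates from the list prices quoted above and estimates the cost of a workload. The workload size (200M input and 40M output tokens) and all function names are hypothetical, chosen only for illustration; cache-hit discounts are ignored.

```python
from decimal import Decimal

# List prices in USD per 1M tokens (input, output), as quoted in this review.
LIST_PRICES = {
    "v4-flash": (Decimal("0.14"), Decimal("0.28")),
    "v4-pro": (Decimal("1.74"), Decimal("3.48")),
}
PROMO_DISCOUNT = Decimal("0.75")  # 75% off V4-Pro list, per the review


def promo_prices(model: str) -> tuple[Decimal, Decimal]:
    """Apply the promotional discount to a model's list prices."""
    inp, out = LIST_PRICES[model]
    factor = Decimal("1") - PROMO_DISCOUNT
    return inp * factor, out * factor


def cost_usd(prices, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of a workload given (input, output) prices per 1M tokens."""
    inp, out = prices
    million = Decimal(1_000_000)
    return inp * input_tokens / million + out * output_tokens / million


# Hypothetical monthly workload: 200M input tokens, 40M output tokens.
workload = (200_000_000, 40_000_000)
for label, prices in [
    ("V4-Flash (list)", LIST_PRICES["v4-flash"]),
    ("V4-Pro (promo)", promo_prices("v4-pro")),
    ("V4-Pro (list)", LIST_PRICES["v4-pro"]),
]:
    print(f"{label}: ${cost_usd(prices, *workload):.2f}")
```

`Decimal` is used instead of floats so that the derived promo rates match the quoted $0.435 / $0.87 exactly rather than to within rounding error.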

Self-hosted (open-source)

$0 + GPU costs

✓ MIT license, open weights on HuggingFace
✓ V4-Flash is feasible on consumer hardware with quantization
✓ V4-Pro needs multi-GPU production infrastructure

System Requirements

Hardware needed to self-host. Min = smallest viable setup (usually heavy quantization). Max = full-precision / production-grade.

Model variant	Notes	Min	Max
DeepSeek V4-Flash (284B total, 13B active MoE)	MIT license, open weights on HuggingFace. Flash is the accessible entry point -- feasible on enthusiast / workstation hardware	96 GB RAM + 1× RTX 3090/4090 (Q4 quantization, ~3-5 tok/s)	2× H100 FP8 or 1× H200 (FP8 production, fast)
DeepSeek V4-Pro (1.6T total, 49B active MoE)	MIT license, open weights. Pro is production multi-GPU territory -- not feasible for individuals	512 GB RAM + 4× RTX 4090 (severe quantization, experimental)	16× H100 FP8 or 8× H200 (full 1.6T production)
DeepSeek V3.2 (671B total, 37B active MoE) -- prior version, still available	MIT license -- commercial use OK	192 GB RAM + 1× RTX 3090/4090 (IQ2_XXS offload, ~2 tok/s)	8× H100 FP8 or 4× H200 (full 671B, production)
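
For a rough sanity check when sizing hardware, the sketch below estimates weight memory from parameter count and bits per weight, and the share of parameters active per token. It is a weights-only approximation: it ignores KV cache, activations, and runtime overhead, and the bit-widths (8, 4, 2) are illustrative assumptions rather than the exact formats behind the table. Real quantization schemes mix bit-widths and runtimes can offload to system RAM or disk, so these estimates will not necessarily line up with the Min/Max figures above.

```python
# (total, active) parameter counts in billions, taken from the table above.
MODELS = {
    "V4-Flash": (284, 13),
    "V4-Pro": (1600, 49),
    "V3.2": (671, 37),
}


def weight_memory_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # billions of params * bits per weight / 8 bits per byte = billions of bytes
    return total_params_billion * bits_per_weight / 8


def active_fraction(total_params_billion: float, active_params_billion: float) -> float:
    """Share of parameters used per token in a mixture-of-experts model."""
    return active_params_billion / total_params_billion


for name, (total, active) in MODELS.items():
    sizes = ", ".join(
        f"{bits}-bit ~{weight_memory_gb(total, bits):.0f} GB" for bits in (8, 4, 2)
    )
    print(f"{name}: {sizes}; active share {active_fraction(total, active):.1%}")
```

The small active share is why these MoE models can produce tokens at usable speeds even when most of the weights sit in slower memory, though every expert still has to be stored somewhere.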

Known Issues

Regional availability restrictions: EU, Canada, South Korea, Australia, and India issued formal restrictions or bans on deployment of DeepSeek-V3 and the enterprise API in Q1 2026 over data-residency concerns (traffic routing through mainland China). Germany's BSI confirmed a classified-metadata leak from a parliamentary pilot. If you're deploying DeepSeek in any of these jurisdictions, check local compliance guidance before shipping; self-hosted open-weights deployment is often the workaround but changes the operational picture. Source: National CSIRT/BSI statements (aggregated), Alibaba policy analysis · 2026-Q1
DeepSeek V4 SHIPPED 2026-04-24. Two-model family released simultaneously: V4-Pro (1.6T total / 49B active MoE) and V4-Flash (284B total / 13B active MoE). Both default to 1M context natively, use DeepSeek's new Hybrid Attention Architecture, and are open-sourced on HuggingFace under MIT license. V4-Pro trails only Gemini 3.1 Pro on world-knowledge benchmarks per early third-party runs. API list pricing: Flash $0.14/$0.28, Pro $1.74/$3.48 per 1M tokens (input/output) -- still 3-10x cheaper than Western frontier models. Tier-1 coverage: Bloomberg, CNBC, TechCrunch, Simon Willison blog. This closes out the 'V4 imminent' watchlist item that had been open since the 2026-04-03 Reuters pre-report. Source: DeepSeek API docs, Bloomberg, CNBC, TechCrunch, Simon Willison · 2026-04-24
PROMO: DeepSeek V4-Pro is 75% off through 2026-05-31 15:59 UTC per the official pricing page (api-docs.deepseek.com/quick_start/pricing). Effective rates during promo: $0.435 input / $0.87 output per 1M tokens (vs $1.74 / $3.48 list); cache-hit input drops to $0.003625/M. After 2026-05-31, pricing reverts to the standard rates. Bloomberg framed the move as a 'Chinese price war' against frontier-model rates from OpenAI / Anthropic / Google. Worth locking in agentic-coding workloads now if you're cost-sensitive. Source: DeepSeek pricing docs (api-docs.deepseek.com/quick_start/pricing), Bloomberg · 2026-04-27
Third-party verification (T+3 days post-launch): Artificial Analysis Intelligence Index pegs V4-Pro at 52 (#2 open-weight, behind Kimi K2.6) and V4-Flash at 47. Vals AI: V4 is #1 open-weight on Vibe Code Bench 'and it's not close', plus #1 open-weight on SWE-bench. SWE-bench Verified 80.6% (effectively tied with Claude Opus 4.6's 80.8%). Codeforces 3206 surpasses GPT-5.4 (3168) -- highest competitive-programming score at release. GDPval-AA agentic 1554 leads all open-weight models. But LMSYS Chatbot Arena Elo around 1220 places V4-Pro alongside GPT-4o and Claude 4 Sonnet, not at the Opus-class frontier (1280+). Simon Willison's pelican-SVG community test produced visibly weak output from V4-Pro (one wing, oversized body) and concluded V4-Pro is 3-6 months behind US frontier labs at a fraction of the cost. Practical verdict: best-in-class open-weight for code/agents/math, mid-pack for general chat quality, weakest for creative/visual generation. Hallucination rate 94%/96% (Pro/Flash) per AA-Omniscience -- caveat for fact-sensitive workloads. Source: Artificial Analysis, Vals AI, Simon Willison, LMSYS Chatbot Arena, Codeforces · 2026-04-27
Refuses to engage with questions about Tiananmen Square, Taiwan sovereignty, and other politically sensitive topics per Chinese regulations. Source: Reddit r/LocalLLaMA · 2026-01
API latency spikes during peak hours, sometimes timing out entirely on longer reasoning chains. Source: GitHub Issues · 2026-02
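
Given the peak-hour timeouts noted above, a common client-side mitigation is to retry with exponential backoff and jitter. The following is a minimal sketch, not DeepSeek's recommended client: `request` is a hypothetical zero-argument callable wrapping whatever API call you make, and the code assumes timeouts surface as Python's built-in `TimeoutError` (real HTTP clients raise their own exception types, so adapt the `except` clause accordingly).

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying on TimeoutError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return request()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the timeout to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter: wait between 50% and 100% of the delay to avoid
            # many clients retrying in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the wait can be swapped out in tests; in normal use the default `time.sleep` applies. For long reasoning chains it may also be worth raising the client's own timeout before adding retries, since a retried request starts the reasoning from scratch.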

Best for

Developers and teams who need strong reasoning and coding capabilities on a budget. If you're building AI features and can't justify GPT-4 API costs, DeepSeek is the obvious first stop.

Not for

Anyone working on content that touches geopolitical topics, or teams that need guaranteed uptime and enterprise SLAs. Also not ideal if your primary use case is creative English writing.

Our Verdict

DeepSeek is the real deal when it comes to bang-for-your-buck AI. The reasoning capabilities are legitimately impressive, and the open-source angle gives it a flexibility that closed models can't match. The censorship limitations are a dealbreaker for some use cases, and the writing quality trails behind Claude and GPT-4. But for coding, math, and analytical tasks? It's hard to argue with near-frontier performance at a fraction of the cost.

Sources

DeepSeek V4 API launch announcement (2026-04-24) (accessed 2026-04-24)
Bloomberg: DeepSeek unveils newest flagship (2026-04-24) (accessed 2026-04-24)
CNBC: DeepSeek V4 LLM preview (2026-04-24) (accessed 2026-04-24)
TechCrunch: DeepSeek V4 closes gap with frontier (2026-04-24) (accessed 2026-04-24)
Simon Willison: DeepSeek V4 (accessed 2026-04-24)
Artificial Analysis: DeepSeek V4 Pro + Flash leading open weights (accessed 2026-04-27)
Vals AI: DeepSeek V4-Pro model card (accessed 2026-04-27)
DeepSeek pricing docs (75% V4-Pro promo through 2026-05-31) (accessed 2026-04-28)
DeepSeek official site (accessed 2026-04-24)
Artificial Analysis benchmarks (accessed 2026-04-24)