DeepSeek

A Tier · 8.0/10

DeepSeek V4 shipped on 2026-04-24 as two models: V4-Pro (1.6T total / 49B active MoE) and V4-Flash (284B total / 13B active MoE). Both offer 1M native context, use the Hybrid Attention Architecture, and are open-source on HuggingFace. V4-Pro trails only Gemini 3.1 Pro on world-knowledge benchmarks.

Last updated: 2026-04-28
Free tier available

Score Breakdown

Ease of Use: 7.5
Output Quality: 8.0
Value: 9.5
Features: 7.0

Benchmark Scores

Benchmarks for DeepSeek V4-Pro (launched 2026-04-24; scores below are the V3.2 baseline pending third-party V4 verification, which typically lands 3-7 days post-launch)

Benchmark | Score
Chatbot Arena Elo (human preference rating) | 1380
MMLU | 90.8%
MMLU-Pro | 85%
GPQA Diamond | 79.9%
HumanEval | 91.5%
SWE-bench | 67.8%

Last updated: 2026-04-24

Personality & Tone

The open-source reasoning specialist

Tone: Direct and technical. DeepSeek's chat models give compact, math- and code-first answers and are noticeably less chatty than Claude or ChatGPT. When asked to reason, they expose a lot of visible thinking.

Quirks: Refusal patterns differ from Western models -- more permissive on many technical and gray-area prompts, more cautious on China-specific political questions. Community-tuned variants exist with different system prompts and guardrails.

The Good and the Bad

What we like

  • Pricing is absurdly cheap compared to GPT-4 or Claude -- we're talking 90%+ savings on API calls
  • DeepSeek-R1 reasoning model genuinely competes with o1 and o3 on math and coding benchmarks
  • Fully open-source weights mean you can run it locally or fine-tune for your own use case
  • 130M+ users and growing fast, so the ecosystem and community support are solid

What could be better

  • Censorship on politically sensitive topics is real and unavoidable -- it's a Chinese company subject to PRC regulations
  • English output quality is good but noticeably behind Claude or GPT-4 for nuanced writing tasks
  • Hallucinations on niche or domain-specific topics happen more often than with top-tier Western models
  • Service reliability has been spotty during high-demand periods -- the free tier especially suffers from rate limiting

Pricing

Free

$0
  • Web chat access at chat.deepseek.com
  • V4-Flash by default (as of 2026-04-24 launch)
  • Basic usage limits

API -- V4-Flash

$0.14 / $0.28 per 1M tokens (input / output)
  • 284B total / 13B active MoE
  • 1M native context
  • Cheapest frontier-class API on market
  • Pay-as-you-go, no minimum

API -- V4-Pro (75% PROMO active through 2026-05-31)

$0.435 / $0.87 per 1M tokens (input / output, promotional)
  • 1.6T total / 49B active MoE
  • 1M native context
  • Trails only Gemini 3.1 Pro on world knowledge benchmarks
  • PROMO PRICING active through 2026-05-31 15:59 UTC -- 75% off list ($1.74/$3.48). Cache-hit input drops to $0.003625/M during promo
  • Post-promo pricing reverts to $1.74/$3.48 per 1M -- still 3-10x cheaper than GPT-5.5 or Claude Opus 4.7
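To make the price gap concrete, here is a minimal cost-estimate sketch using only the per-1M-token rates listed above. The names `Rate`, `RATES`, `estimate_cost`, the plan keys, and the example workload are illustrative assumptions, not part of any DeepSeek SDK. Cache-hit discounts are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Rates copied from the pricing tiers above; plan keys are made up for this sketch.
RATES = {
    "v4-flash": Rate(0.14, 0.28),
    "v4-pro-promo": Rate(0.435, 0.87),  # 75% off list, through 2026-05-31
    "v4-pro-list": Rate(1.74, 3.48),
}


def estimate_cost(plan: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a token volume on the given plan."""
    rate = RATES[plan]
    return (input_tokens * rate.input_per_m
            + output_tokens * rate.output_per_m) / 1_000_000


if __name__ == "__main__":
    # Hypothetical monthly workload: 50M input tokens, 10M output tokens.
    for plan in RATES:
        print(f"{plan}: ${estimate_cost(plan, 50_000_000, 10_000_000):.2f}")
```

For that hypothetical workload, the estimate comes to roughly $9.80 on V4-Flash, $30.45 on V4-Pro at the promo rate, and $121.80 on V4-Pro at list price, so the promo rate is a quarter of list.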

Self-hosted (open-source)

$0 + GPU costs
  • MIT license, open weights on HuggingFace
  • V4-Flash is feasible on consumer hardware with quantization
  • V4-Pro needs multi-GPU production infrastructure

System Requirements

Hardware needed to self-host. Min = smallest viable setup (usually heavy quantization). Max = full-precision / production-grade.

DeepSeek V4-Flash (284B total, 13B active MoE)
  • Notes: MIT license, open weights on HuggingFace. Flash is the accessible entry point -- feasible on enthusiast / workstation hardware
  • Min: 96 GB RAM + 1× RTX 3090/4090 (Q4 quantization, ~3-5 tok/s)
  • Max: 2× H100 FP8 or 1× H200 (FP8 production, fast)

DeepSeek V4-Pro (1.6T total, 49B active MoE)
  • Notes: MIT license, open weights. Pro is production multi-GPU territory -- not feasible for individuals
  • Min: 512 GB RAM + 4× RTX 4090 (severe quantization, experimental)
  • Max: 16× H100 FP8 or 8× H200 (full 1.6T production)

DeepSeek V3.2 (671B total, 37B active MoE) -- prior version, still available
  • Notes: MIT license -- commercial use OK
  • Min: 192 GB RAM + 1× RTX 3090/4090 (IQ2_XXS offload, ~2 tok/s)
  • Max: 8× H100 FP8 or 4× H200 (full 671B, production)

Known Issues

  • Regional availability restrictions: the EU, Canada, South Korea, Australia, and India issued formal restrictions or bans on deployment of DeepSeek-V3 and the enterprise API in Q1 2026 over data-residency concerns (traffic routing through mainland China). Germany's BSI confirmed a classified-metadata leak from a parliamentary pilot. If you're deploying DeepSeek in any of these jurisdictions, check local compliance guidance before shipping; self-hosted open-weights deployment is often the workaround, but it changes the operational picture. Source: National CSIRT/BSI statements (aggregated), Alibaba policy analysis · 2026-Q1
  • DeepSeek V4 shipped 2026-04-24. Two-model family released simultaneously: V4-Pro (1.6T total / 49B active MoE) and V4-Flash (284B total / 13B active MoE). Both default to 1M context natively, use DeepSeek's new Hybrid Attention Architecture, and are open-sourced on HuggingFace under the MIT license. V4-Pro trails only Gemini 3.1 Pro on world-knowledge benchmarks per early third-party runs. API list pricing: Flash $0.14 / $0.28, Pro $1.74 / $3.48 per 1M tokens (input / output) -- still 3-10x cheaper than Western frontier models. Tier-1 coverage: Bloomberg, CNBC, TechCrunch, Simon Willison's blog. This closes out the 'V4 imminent' watchlist item that had been open since the 2026-04-03 Reuters pre-report. Source: DeepSeek API docs, Bloomberg, CNBC, TechCrunch, Simon Willison · 2026-04-24
  • PROMO: DeepSeek V4-Pro is 75% off through 2026-05-31 15:59 UTC per the official pricing page (api-docs.deepseek.com/quick_start/pricing). Effective rates during the promo: $0.435 input / $0.87 output per 1M tokens (vs $1.74 / $3.48 list); cache-hit input drops to $0.003625/M. After 2026-05-31, pricing reverts to the standard rates. Bloomberg framed the move as a 'Chinese price war' against frontier-model rates from OpenAI / Anthropic / Google. Worth locking in agentic-coding workloads now if you're cost-sensitive. Source: DeepSeek pricing docs (api-docs.deepseek.com/quick_start/pricing), Bloomberg · 2026-04-27
  • Third-party verification (T+3 days post-launch): Artificial Analysis Intelligence Index pegs V4-Pro at 52 (#2 open-weight, behind Kimi K2.6) and V4-Flash at 47. Vals AI: V4 is #1 open-weight on Vibe Code Bench 'and it's not close', plus #1 open-weight on SWE-bench. SWE-bench Verified is 80.6% (effectively tied with Claude Opus 4.6's 80.8%). Codeforces 3206 surpasses GPT-5.4 (3168) -- the highest competitive-programming score at release. GDPval-AA agentic 1554 leads all open-weight models. However, an LMSYS Chatbot Arena Elo of around 1220 places V4-Pro alongside GPT-4o and Claude 4 Sonnet, not at the Opus-class frontier (1280+). Simon Willison's pelican-SVG community test produced visibly weak output from V4-Pro (one wing, oversized body), and he concluded V4-Pro is 3-6 months behind US frontier labs at a fraction of the cost. Practical verdict: best-in-class open-weight for code/agents/math, mid-pack for general chat quality, weakest for creative/visual generation. Hallucination rate is 94% / 96% (Pro / Flash) per AA-Omniscience -- a caveat for fact-sensitive workloads. Source: Artificial Analysis, Vals AI, Simon Willison, LMSYS Chatbot Arena, Codeforces · 2026-04-27
  • Refuses to engage with questions about Tiananmen Square, Taiwan sovereignty, and other politically sensitive topics per Chinese regulations. Source: Reddit r/LocalLLaMA · 2026-01
  • API latency spikes during peak hours, sometimes timing out entirely on longer reasoning chains. Source: GitHub Issues · 2026-02
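If you hit the peak-hour timeouts noted above, one common client-side mitigation is to retry with exponential backoff and jitter. The sketch below is generic and is not taken from DeepSeek's docs: `call_with_backoff` and its parameters are hypothetical names, and it assumes your HTTP client surfaces timeouts as Python's built-in `TimeoutError` (adapt the `except` clause to whatever your client actually raises).

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TimeoutError with exponential backoff and full jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the last timeout to the caller
            # Backoff ceiling doubles each attempt: base, 2*base, 4*base, ... capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time in [0, ceiling] to avoid synchronized retries.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Wrap the actual request in a zero-argument callable, for example `call_with_backoff(lambda: send_request(payload))`, where `send_request` stands in for your own client call. The `sleep` parameter is injectable so the retry logic can be tested without real waiting.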

Best for

Developers and teams who need strong reasoning and coding capabilities on a budget. If you're building AI features and can't justify GPT-4 API costs, DeepSeek is the obvious first stop.

Not for

Anyone working on content that touches geopolitical topics, or teams that need guaranteed uptime and enterprise SLAs. Also not ideal if your primary use case is creative English writing.

Our Verdict

DeepSeek is the real deal when it comes to bang-for-your-buck AI. The reasoning capabilities are legitimately impressive, and the open-source angle gives it a flexibility that closed models can't match. The censorship limitations are a dealbreaker for some use cases, and the writing quality trails behind Claude and GPT-4. But for coding, math, and analytical tasks? It's hard to argue with near-frontier performance at a fraction of the cost.

Sources

  • DeepSeek V4 API launch announcement (2026-04-24) (accessed 2026-04-24)
  • Bloomberg: DeepSeek unveils newest flagship (2026-04-24) (accessed 2026-04-24)
  • CNBC: DeepSeek V4 LLM preview (2026-04-24) (accessed 2026-04-24)
  • TechCrunch: DeepSeek V4 closes gap with frontier (2026-04-24) (accessed 2026-04-24)
  • Simon Willison: DeepSeek V4 (accessed 2026-04-24)
  • Artificial Analysis: DeepSeek V4 Pro + Flash leading open weights (accessed 2026-04-27)
  • Vals AI: DeepSeek V4-Pro model card (accessed 2026-04-27)
  • DeepSeek pricing docs (75% V4-Pro promo through 2026-05-31) (accessed 2026-04-28)
  • DeepSeek official site (accessed 2026-04-24)
  • Artificial Analysis benchmarks (accessed 2026-04-24)

Alternatives to DeepSeek

Llama 4 (Meta)

Meta's open-weights flagship family -- Scout (10M context), Maverick (multimodal 400B MoE), Behemoth in preview

B
7.9/10
Free tier · From $0
  • Llama 4 Scout has a 10M token context wi...
  • Llama 4 Maverick is natively multimodal ...
Updated 2026-04-13

Mistral AI

European AI lab with open and commercial models -- Mistral Medium 3.5 SHIPPED 2026-04-29 (128B dense, 256k context, 77.6% SWE-Bench Verified) plus Vibe Remote Agents + Le Chat Work Mode. Earlier 2026 line: Small 4 (Mar 2026 119B MoE Apache 2.0 unified), Medium 3 (Apr 9 2026), Voxtral TTS (Mar 2026 open-source speech)

B
7.5/10
Free tier · From $0
  • Mistral Medium 3.5 (April 29 2026) is Mi...
  • Vibe Remote Agents (also 4/29) lets you ...
Updated 2026-05-04

Gemma 4 (Google)

Google DeepMind's open-weights model family -- multimodal, 256K context, runs on edge devices

A
8.3/10
Free tier · From $0
  • Apache 2.0 license -- truly permissive, ...
  • Multimodal: handles text + image input (...
Updated 2026-04-19

Qwen (Alibaba)

Alibaba's open-weights + API family -- Qwen3.6-27B dense (Apr 22 2026 Apache 2.0, beats the 397B MoE flagship on coding from a single consumer GPU), Qwen 3.6-Max-Preview (Apr 20 2026 closed-weights #1 on SWE-bench Pro/Terminal-Bench 2.0/SciCode), Qwen3.6-35B-A3B (Apr 16 open-weights MoE), plus Qwen 3.6-Plus API flagship

A
8.8/10
Free tier · From $0
  • Qwen 3.6-Plus (launched Mar 30 2026) is ...
  • Qwen3.5 Small (0.8B / 2B / 4B / 9B) is t...
Updated 2026-04-27

GLM / Z.ai (Zhipu AI)

Zhipu AI's open-weights family -- GLM-5.1 (launched 2026-04-07) is 744B MoE / 40B active, topped SWE-Bench Pro at 58.4 (beating GPT-5.4 and Claude Opus 4.6), MIT licensed, 200K context. Trained entirely on 100K Huawei Ascend 910B chips -- first frontier model with zero Nvidia in the training stack

A
8.0/10
Free tier · From $0
  • GLM-5.1 (2026-04-07) topped SWE-Bench Pr...
  • First frontier model trained entirely on...
Updated 2026-04-17

Kimi K2.6 (Moonshot)

Moonshot's 1T-parameter MoE open-weights flagship -- Kimi K2.6 (GA 2026-04-20) is #1 open-weights on Artificial Analysis Intelligence Index v4.0 (score 54, ranked #4 overall). Native video input, 256K context, Modified MIT license

A
8.1/10
Free tier · From $0
  • Frontier-tier performance -- Elo 1309 on...
  • Beats Claude Opus 4.5 on several coding ...
Updated 2026-05-13

Nemotron (Nvidia)

Nvidia's open-weights family -- hybrid Mamba-Transformer MoE architecture, optimized for efficient reasoning on Nvidia hardware

B
7.8/10
Free tier · From $0
  • Hybrid Mamba-Transformer architecture dr...
  • Nemotron 3 Super activates only 3.6B par...
Updated 2026-04-19

MiniMax M2.7

MiniMax's open-weights self-evolving agent flagship -- M2.7 (released 2026-03-18) scores 56.22% SWE-Pro and 57.0% Terminal Bench 2 from a 229B/10B-active MoE

A
8.4/10
Free tier · From $0
  • 229B/10B-active MoE delivers Tier-1 agen...
  • Sparse MoE design: ~10B active params du...
Updated 2026-04-27

Falcon (TII)

UAE's Technology Innovation Institute open-weights family -- Falcon 3 optimized for efficient sub-10B deployment on consumer hardware

B
7.1/10
Free tier · From $0
  • Apache 2.0 license -- fully permissive f...
  • Sub-10B sizes run on any consumer GPU or...
Updated 2026-04-13

gpt-oss (OpenAI)

OpenAI's FIRST open-weight models -- gpt-oss-120b (single 80GB GPU, near parity with o4-mini on reasoning) and gpt-oss-20b (runs on 16GB edge devices). Apache 2.0. Launched 2025-08-05. gpt-oss-safeguard ships in 2026 as the safety-tuned variant

A
8.1/10
Free tier · From $0
  • First-ever OpenAI open-weight release --...
  • gpt-oss-120b approaches o4-mini on reaso...
Updated 2026-04-17

IBM Granite 4.0

IBM's enterprise-focused open-weight family -- Granite 4.0 hybrid Mamba-2 + transformer architecture (70-80% memory reduction vs pure transformer), 3B to 32B sizes, Apache 2.0. First open model family to secure ISO 42001 certification. Nano 350M runs on CPU with 8-16GB RAM. 3B Vision variant landed 2026-04-01

A
8.2/10
Free tier · From $0
  • Hybrid Mamba-2 + transformer architectur...
  • Granite 4.0 Nano (350M and 1.5B) is genu...
Updated 2026-04-17

Arcee Trinity-Large-Thinking

Arcee AI's US-made open-weight frontier reasoning model -- launched 2026-04-01. 398B total params, ~13B active. Sparse MoE (256 experts, 4 active = 1.56% routing). Apache 2.0, trained from scratch. #2 on PinchBench trailing only Claude 3.5 Opus. ~96% cheaper than Opus-4.6 on agentic tasks

A
8.1/10
Free tier · From $0
  • Rare US-made frontier-tier open-weight r...
  • Trained from scratch (not a fine-tune) a...
Updated 2026-04-17

Olmo 3 (AI2)

Allen Institute for AI's fully-open frontier reasoning models -- Olmo 3 family (2025-11-20) includes 7B and 32B sizes, four variants (Base, Think, Instruct, RLZero). Apache 2.0 with fully open data + checkpoints + training logs. Olmo 3-Think 32B matches Qwen3-32B-Thinking at 6x fewer training tokens

B
7.9/10
Free tier · From $0
  • FULLY OPEN is a different category than ...
  • Olmo 3-Think 32B matches Qwen3-32B-Think...
Updated 2026-04-17

AI21 Jamba2

AI21 Labs' hybrid SSM-Transformer (Mamba-style) open-weight family -- Jamba2 launched 2026-01-08. Two sizes: 3B dense (runs on phones / laptops) and Jamba2 Mini MoE (12B active / 52B total). Apache 2.0, 256K context, mid-trained on 500B tokens

A
8.0/10
Free tier · From $0
  • Hybrid SSM-Transformer (Mamba-style) arc...
  • Jamba2 3B dense runs realistically on iP...
Updated 2026-04-17

StepFun Step 3.5 Flash

StepFun's (China) agent-focused open-weight model -- Step 3.5 Flash launched 2026-02-01. 196B sparse MoE, ~11B active. Benchmarks slightly ahead of DeepSeek V3.2 at over 3x smaller total size. Step 3 (321B / 38B active, Apache 2.0) and Step3-VL-10B multimodal also in the family

B
7.8/10
Free tier · From $0
  • Step 3.5 Flash at 196B total / 11B activ...
  • Agent-focused tuning explicitly -- tool ...
Updated 2026-04-17

Cohere Command A

Cohere's enterprise-multilingual flagship -- 111B params, 256K context, runs on 2x H100. 23 languages. CC-BY-NC 4.0 on weights (research / non-commercial), commercial requires Cohere enterprise contract. Follow-ups: Command A Reasoning + Command A Vision

B
7.5/10
Free tier · From $0
  • Best-in-class multilingual open-weight m...
  • Runs on just 2x H100 at FP16 for the ful...
Updated 2026-04-17