Coming soon

DeepSeek V4 status hub

DeepSeek V4 is positioned as the next major step in the DeepSeek lineup, with a large-scale Mixture-of-Experts design and a strong focus on reasoning, code, and long-context workloads. The working profile centers on trillion-class capacity with sparse activation, balancing large knowledge coverage with production-ready throughput.

The expected performance focus spans code generation, math accuracy, and structured reasoning, with evaluation typically framed around benchmarks such as MMLU, HumanEval, GSM8K, and MATH. Multimodal capability remains a core goal, with deep image understanding and a roadmap toward richer video workflows. Official launch timing and final specifications are pending, so this page summarizes the target profile and what teams can do now.

Scale and MoE routing
Trillion-class capacity with sparse activation keeps inference practical at production scale.
Long-context reasoning
Targets emphasize 100K-class context for large documents, codebases, and multi-step workflows.
Code and math strength
Evaluation focus remains on code generation, math accuracy, and structured reasoning tasks.
Multimodal expansion
Image understanding is expected to deepen with a roadmap toward richer video capabilities.

Model snapshot

The V4 target profile emphasizes scale without runaway inference cost. A large MoE backbone keeps total capacity high while activating a smaller subset of experts per token. That balance is meant to preserve throughput for production use cases such as retrieval-augmented workflows, long-document analysis, and multi-step reasoning.

Total parameters
~1T
Mixture-of-Experts capacity target
Active parameters
~320B
Sparse activation per token
Expert layout
1 shared + 256 routed
Top-k routing (k=8) at inference
Context target
100K-class
Designed for long-form reasoning
Capability profile
Relative emphasis across core evaluation areas.
Reasoning depth: math and logic tasks
Code generation: high-precision outputs
Long-context handling: large documents
Multimodal readiness: image + video roadmap
Efficiency: sparse compute path
Scale vs. activation
Sparse activation keeps inference efficient while retaining massive capacity.
Active parameters: ~320B
Total capacity: ~1T
The active slice is intentionally smaller than total capacity, enabling higher throughput without discarding large-scale knowledge coverage.
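The shared-plus-routed layout in the snapshot can be sketched as a toy top-k router. The expert counts (1 shared + 256 routed) and k = 8 follow the target profile above; the token count, random gating logits, and array shapes are purely illustrative, not DeepSeek's implementation.

```python
import numpy as np

def topk_route(logits: np.ndarray, k: int):
    """Pick the top-k routed experts per token and softmax-normalize their gates."""
    idx = np.argpartition(logits, -k, axis=-1)[..., -k:]       # ids of the k largest logits
    gate_logits = np.take_along_axis(logits, idx, axis=-1)
    gates = np.exp(gate_logits - gate_logits.max(-1, keepdims=True))
    gates /= gates.sum(-1, keepdims=True)                      # gates sum to 1 per token
    return idx, gates

# Target layout from the snapshot: 1 shared + 256 routed experts, top-k with k = 8.
NUM_ROUTED, TOP_K = 256, 8
rng = np.random.default_rng(0)
router_logits = rng.normal(size=(4, NUM_ROUTED))               # 4 example tokens
experts, gates = topk_route(router_logits, TOP_K)

# Each token activates the shared expert plus its k routed experts,
# so only 9 of 257 experts run per token -- the source of the ~320B / ~1T gap.
active_per_token = 1 + TOP_K
```

The point of the sketch is the shape of the computation: routing cost is a cheap argmax-style selection, while the expensive expert FFNs run only for the selected slice.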

V4 briefings


DeepSeek V4 briefing 01 (AIM Network)
Revisits the market shock after DeepSeek R1 and frames V4 as a strategic shift: multimodal support, open access, and optimization for domestic chips (Huawei, Cambricon) rather than NVIDIA. It highlights China’s fast-moving model race (Qwen, Seed, Moonshot, Zhipu, MiniMax) and argues that efficiency plus local silicon could reshape the global AI balance.
DeepSeek V4 briefing 02 (The Information)
Interview-style coverage emphasizing internal benchmark confidence in coding performance, tracing the V3 → R1 → V4 arc, and presenting V4 as a direct open-source challenge to closed-model leaders. It also notes the growing global footprint of open-source model usage led by China-based teams.
DeepSeek V4 briefing 03 (Fahd Mirza)
Focuses on launch timing narratives and positions V4 as multimodal with deeper domestic chip collaboration. It revisits V3’s scale and cost advantage, rejects unreliable benchmark chatter, and frames the release window as strategically meaningful while noting competitive disputes around distillation and hardware access.
DeepSeek V4 briefing 04 (Universe of AI)
A technical update sweep: reported 1M-token context, native multimodal support, potential Blackwell compatibility, and DeepGEMM advances (Manifold constraints, FP4 inference). It also references broader competitive signals and the intersection of capability, hardware, and geopolitics.
Shared themes
All four briefings frame DeepSeek V4 as a 2026 inflection point: top-tier capability with lower compute cost, strong multimodal ambition, and a hardware strategy that leans into domestic silicon. A suggested viewing order is Video 1 for macro impact, Video 2 for benchmark positioning, Video 3 for launch timing narratives, and Video 4 for the latest technical update set.

How to prepare today

V4 access will open after the official launch, but teams can prepare by validating workloads on the current DeepSeek lineup. Focus on prompt structures, evaluation harnesses, and routing strategies so that the switch to V4 is a controlled migration rather than a fresh integration. Keep your internal benchmarks aligned with code, math, and long-context tasks to make the eventual comparison straightforward.

  • Benchmark tasks using V3.1, R1, Math-7B, Janus-Pro-7B, and VL2.
  • Document latency, cost, and quality trade-offs per model.
  • Prepare evaluation sets for long-context and tool use.
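One way to make the checklist concrete is a small harness that scores each current model on the same cases and records latency alongside accuracy, so the eventual V4 comparison is a one-line model swap. This is a minimal sketch: the model ids and the `ask` stub are placeholders, and in practice `ask` would wrap your actual API client.

```python
import statistics
import time
from typing import Callable

# Placeholder model ids; substitute whatever identifiers your gateway exposes.
CANDIDATE_MODELS = ["deepseek-v3.1", "deepseek-r1"]

def run_suite(ask: Callable[[str, str], str], model: str, cases: list) -> dict:
    """Score one model on {'prompt': ..., 'expect': ...} cases, tracking
    substring-match accuracy and wall-clock latency per call."""
    latencies, correct = [], 0
    for case in cases:
        t0 = time.perf_counter()
        answer = ask(model, case["prompt"])
        latencies.append(time.perf_counter() - t0)
        correct += case["expect"] in answer
    return {
        "model": model,
        "accuracy": correct / len(cases),
        "p50_latency_s": statistics.median(latencies),
    }

# Offline stub standing in for a real model call so the harness runs anywhere.
def fake_ask(model: str, prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "unsure"

cases = [{"prompt": "What is 2 + 2?", "expect": "4"}]
report = [run_suite(fake_ask, m, cases) for m in CANDIDATE_MODELS]
```

Keeping the harness model-agnostic (the model id is just a string passed through) is what turns the V4 switch into a controlled migration rather than a rewrite.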
Start with legacy models
While V4 prepares for launch, the current lineup is free for the first 30 days, and you can switch models instantly inside the Playground.