DeepSeek-V4 API Pricing: Flash Replaces V3.2 at a Lower Price with 8x the Context; Pro Output Costs 8x More, with a Cut Expected Once Ascend 950 Nodes Arrive in H2

According to monitoring by Dynamic Insight Beating, DeepSeek has added V4-Pro and V4-Flash to its API, and its official account has announced pricing and compute-capacity plans.

V4-Flash directly replaces V3.2 (deepseek-chat) with no price increase; in fact, prices have fallen: cache-hit input stays at 0.2 USD per million tokens, cache-miss input drops from 2 USD to 1 USD (a 50% cut), and output drops from 3 USD to 2 USD (a 33% cut). The context window grows from 128K to 1M tokens, so you now get 8x the context at a lower price. The two legacy model names, deepseek-chat and deepseek-reasoner, will be deprecated on July 24, 2026; for now they point to V4-Flash in non-reasoning and reasoning mode, respectively.
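The savings are easy to check with simple arithmetic. A minimal sketch, using only the per-million-token prices quoted above (the token counts in the example are hypothetical):

```python
def request_cost(cache_hit_in: int, cache_miss_in: int, out_tokens: int,
                 prices: tuple[float, float, float]) -> float:
    """Cost in USD for one request.

    prices = (cache-hit input, cache-miss input, output),
    each in USD per 1M tokens.
    """
    hit_price, miss_price, out_price = prices
    return (cache_hit_in * hit_price
            + cache_miss_in * miss_price
            + out_tokens * out_price) / 1_000_000

V3_2 = (0.2, 2.0, 3.0)      # old deepseek-chat pricing
V4_FLASH = (0.2, 1.0, 2.0)  # new V4-Flash pricing

# Hypothetical request: 100K cache-miss input tokens, 20K output tokens
old = request_cost(0, 100_000, 20_000, V3_2)      # 0.26 USD
new = request_cost(0, 100_000, 20_000, V4_FLASH)  # 0.14 USD
print(f"old: {old:.2f} USD, new: {new:.2f} USD")
```

For output-heavy workloads the effective discount sits between the 33% output cut and the 50% input cut, while the context budget grows eightfold.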

V4-Pro is a brand-new high-end tier: 1 USD per million tokens for cache-hit input, 12 USD for cache-miss input, and 24 USD for output, putting its output price at 8x that of V3.2. In the pricing-table notes, DeepSeek explains that high-end compute is scarce and Pro throughput is therefore tightly limited for now, but a significant price cut is expected in the second half of the year once Ascend 950 super nodes come online in bulk.

Both models support non-reasoning and reasoning modes; in reasoning mode, the reasoning_effort parameter accepts a high or max setting. DeepSeek stated in the announcement that "from now on, 1M context will be the standard configuration for all official DeepSeek services."
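Since the legacy model names remain callable until deprecation, a reasoning-mode request might look like the sketch below. The model name is from the article; the exact wire format for passing reasoning_effort on V4 is an assumption based on the announcement's wording, not a confirmed API schema:

```python
import json

# Sketch of a chat-completions request body. "deepseek-reasoner" currently
# routes to V4-Flash in reasoning mode per the deprecation note above;
# the top-level "reasoning_effort" field is an assumption.
payload = {
    "model": "deepseek-reasoner",
    "messages": [
        {"role": "user", "content": "Summarize the V4 pricing changes."}
    ],
    "reasoning_effort": "high",  # announcement: "high" or "max"
    "max_tokens": 1024,
}

body = json.dumps(payload)
# POST this body to DeepSeek's chat-completions endpoint with your API key.
print(body)
```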
