DeepSeek: DeepSeek V4 Flash
deepseek/deepseek-v4-flash | Created Apr 24, 2026 | 1M context | Starting at $0.1/M input tokens | Starting at $0.2/M output tokens
DeepSeek V4 Flash is an efficiency-focused Mixture-of-Experts model built for fast inference, high-throughput applications, coding assistants, chat systems, and agent workflows. It keeps strong reasoning and coding performance while prioritizing responsiveness and cost efficiency.
Providers for DeepSeek V4 Flash
Requests are routed to the best providers able to handle your prompt size and parameters.
| Provider | Context Window | Max Output | Input Price | Output Price | Cache Read | Cache Write |
|---|---|---|---|---|---|---|
| DeepInfra | 1048.6K | 16.4K | $0.1 | $0.2 | $0.02 | — |
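
The prices listed for DeepInfra can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the function name `estimate_cost` is hypothetical, and it assumes that all prices are in USD per million tokens (stated above for input and output, assumed for cache reads) and that cached input tokens are billed at the cache-read rate instead of the input rate. Actual billing may differ.

```python
# Prices from the DeepInfra row, in USD per million tokens.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.20
# Assumption: the cache-read price uses the same per-million-token unit.
CACHE_READ_PRICE_PER_M = 0.02


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cached input tokens are charged at the cache-read rate
    and the remaining input tokens at the regular input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_input_tokens
    total = (
        uncached_tokens * INPUT_PRICE_PER_M
        + cached_input_tokens * CACHE_READ_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 200K-token prompt with 150K tokens served from cache, 10K-token reply.
    cost = estimate_cost(200_000, 10_000, cached_input_tokens=150_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, caching matters most for long prompts: the listed cache-read price is one fifth of the input price, so a prompt that is mostly cached costs far less than the same prompt sent uncached.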