DeepSeek V4 Flash

Name: DeepSeek V4 Flash
Brand: DeepSeek
SKU: deepseek-v4-flash

DeepSeek V4 Flash is an efficiency-focused Mixture-of-Experts model built for fast inference, high-throughput applications, coding assistants, chat systems, and agent workflows. It keeps strong reasoning and coding performance while prioritizing responsiveness and cost efficiency.

relay/request.mjs (ready to route)

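import OpenAI from "openai";

// Assumed setup: OpenAI SDK pointed at the PipeLLM route; the env var name is hypothetical.
const client = new OpenAI({
  baseURL: "https://api.pipellm.ai/openai",
  apiKey: process.env.PIPELLM_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "deepseek-v4-flash",
  messages: [
    { role: "user", content: "Plan a multi-step task" }
  ]
});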

CONTEXT: 1M (prompt window)

MAX OUTPUT: 16.4K (per response)

INPUT: $0.10 / M (token price)

OUTPUT: $0.20 / M (token price)

Route configuration

Use the model through the contract you already have.

PipeLLM resolves provider access, policy, and supported model capabilities without changing the interface your application calls.

ENDPOINT: https://api.pipellm.ai/openai (OpenAI-compatible route)

MODEL ID: deepseek-v4-flash (use this in every request)

GOVERNANCE: Policy controlled (approvals and tool rules apply)

PROVIDER AVAILABILITY: 1 route

Provider	Context	Max output	Input / M	Output / M
DeepInfra	1M context	16.4K	$0.10 / M	$0.20 / M

Displayed pricing is per one million tokens and may vary by provider.
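
Because pricing is quoted per one million tokens, a per-request estimate is a simple proportion. The sketch below uses the DeepInfra prices from the table; the `PRICING` map and the `estimateCostUSD` helper are hypothetical names introduced for illustration, not part of any PipeLLM API.

```javascript
// Prices in USD per one million tokens, copied from the provider table.
// PRICING and estimateCostUSD are illustrative names, not a PipeLLM API.
const PRICING = {
  DeepInfra: { inputPerM: 0.10, outputPerM: 0.20 },
};

// Estimate the cost of one request from its input and output token counts.
function estimateCostUSD(provider, inputTokens, outputTokens) {
  const price = PRICING[provider];
  if (!price) {
    throw new Error(`No pricing listed for provider: ${provider}`);
  }
  return (
    (inputTokens / 1e6) * price.inputPerM +
    (outputTokens / 1e6) * price.outputPerM
  );
}

// Example: 200,000 prompt tokens and 10,000 completion tokens.
const cost = estimateCostUSD("DeepInfra", 200_000, 10_000);
console.log(cost.toFixed(4)); // prints "0.0220"
```

Keying the map by provider leaves room for additional routes whose prices differ, which the note on provider variation anticipates.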

SUPPORTED SURFACES: active

input: text, output: text, tool use
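
Since tool use is listed as a supported surface, a request can offer tools in the OpenAI chat completions format. The following is a minimal sketch under that assumption: the `get_weather` tool, its parameters, and the helper names `buildToolRequest` and `readToolCalls` are hypothetical, and whether a given tool is permitted depends on the governance policy on the route.

```javascript
// Build an OpenAI-compatible chat request body that offers one tool.
// "get_weather" and its parameters are hypothetical, for illustration only.
function buildToolRequest(userText) {
  return {
    model: "deepseek-v4-flash",
    messages: [{ role: "user", content: userText }],
    tools: [
      {
        type: "function",
        function: {
          name: "get_weather",
          description: "Look up the current weather for a city.",
          parameters: {
            type: "object",
            properties: { city: { type: "string" } },
            required: ["city"],
          },
        },
      },
    ],
  };
}

// Extract tool calls from a response in the OpenAI chat completions shape.
// Returns an empty array when the model answered with plain text instead.
function readToolCalls(completion) {
  const calls = completion.choices?.[0]?.message?.tool_calls ?? [];
  return calls.map((call) => ({
    name: call.function.name,
    args: JSON.parse(call.function.arguments),
  }));
}

const body = buildToolRequest("What is the weather in Lisbon?");
console.log(JSON.stringify(body, null, 2));
```

The body can be passed to `client.chat.completions.create(body)`, and the result handed to `readToolCalls` to see which tools, if any, the model asked to run.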

Relay governs model access with the same routing policy that Lens keeps visible and auditable.