Streaming Guide
How to consume SSE streams from the Chat Completions API
When you send a request to POST /v1/chat/completions with stream: true (the default), the response is delivered as Server-Sent Events (SSE). Each event is a line prefixed with "data: " followed by a JSON chunk, and the stream ends with "data: [DONE]".
Stream Lifecycle
A stream goes through several phases in order:
1. role chunk {"delta": {"role": "assistant"}}
2. reasoning chunks {"delta": {"reasoning_steps": [...]}} (thinking, tool use)
3. content chunks {"delta": {"content": "..."}} (final answer)
4. finish chunk {"delta": {}, "finish_reason": "stop", "usage": {...}}
5. [DONE] signal data: [DONE]
Phase 1: Reasoning
The model emits reasoning_steps in the delta — thinking, web search, code execution, tool calls. This phase may take time as the model uses tools.
Phase 2: Final Answer
After reasoning completes, the final answer streams via delta.content token by token, just like standard OpenAI streaming. The last chunk carries finish_reason and usage.
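The phase ordering above can be sketched as a small classifier over parsed chunks. The helper name and phase labels below are ours, not part of the API:

```javascript
// Classify a parsed chat.completion.chunk into its lifecycle phase.
// Helper name and phase labels are illustrative, not part of the API.
function chunkPhase(chunk) {
  const choice = chunk.choices?.[0];
  if (!choice) return 'unknown';
  if (choice.finish_reason) return 'finish';         // final chunk: finish_reason + usage
  const delta = choice.delta ?? {};
  if (delta.role) return 'role';                     // role announcement (first chunk)
  if (delta.reasoning_steps) return 'reasoning';     // thinking / tool use
  if (delta.content !== undefined) return 'content'; // final answer tokens
  return 'unknown';
}
```

A dispatcher like this keeps rendering logic (show a spinner for reasoning, append tokens for content) separate from stream parsing.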
Chunk Format
Each SSE line contains a JSON chunk following the OpenAI chat.completion.chunk format:
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"mirothinker-1-7-deepresearch-mini","created":1712345678,"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"mirothinker-1-7-deepresearch-mini","created":1712345678,"choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":100,"total_tokens":110}}
data: [DONE]
Chunk Fields
- id — Unique identifier for this completion (format: chatcmpl-{workflow_id})
- object — Always "chat.completion.chunk"
- model — The model used for this completion
- choices[].delta — Incremental content; may contain role, content, or reasoning_steps
- choices[].finish_reason — null during streaming, then "stop", "error", or "cancelled" in the final chunk
- usage — Token usage stats, only present in the final chunk (prompt_tokens, completion_tokens, total_tokens, reasoning_tokens, num_search_queries)
Delta Types
The delta object in each chunk contains one of the following:
- {"delta": {"role": "assistant"}} — role announcement
- {"delta": {"reasoning_steps": [...]}} — reasoning steps (thinking, tool use)
- {"delta": {"content": "..."}} — final answer tokens
Heartbeats
During long pauses (agent thinking, tool execution), the server sends SSE comment lines as keep-alive heartbeats to prevent proxy/CDN idle timeouts:
: heartbeat
SSE comments (lines starting with ":") are part of the SSE spec and are automatically ignored by EventSource clients. If you parse SSE manually, skip lines that start with ":".
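If you parse SSE manually, the per-line handling described above can be factored into one small function. The function name and result shape below are ours, not part of the API:

```javascript
// Parse one SSE line into a tagged result. Returns null for comment
// lines (heartbeats), blank lines, and non-data lines, which should
// simply be skipped. Name and result shape are illustrative.
function parseSSELine(line) {
  if (!line || line.startsWith(':')) return null;     // heartbeat / blank line
  if (!line.startsWith('data: ')) return null;        // not a data line
  const payload = line.slice(6);
  if (payload === '[DONE]') return { done: true };    // end-of-stream signal
  return { done: false, chunk: JSON.parse(payload) }; // parsed JSON chunk
}
```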
Complete Example
A complete streaming client that handles reasoning steps, content, finish reason, usage, and heartbeats:
Standard usage (OpenAI SDK)
Uses the official OpenAI SDK. Returns standard fields: content, finish_reason, usage. The SDK drops unknown fields, so reasoning_steps / citations / search_results are not accessible this way.
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.MIROMIND_API_KEY,
baseURL: 'https://api.miromind.ai/v1',
});
const stream = await client.chat.completions.create({
model: 'mirothinker-1-7-deepresearch-mini',
messages: [{ role: 'user', content: 'Hello' }],
stream: true,
});
for await (const chunk of stream) {
const choice = chunk.choices[0];
const delta = choice.delta;
// Final answer (token-by-token)
if (delta?.content) {
process.stdout.write(delta.content);
}
// Finish + usage (last chunk)
if (choice.finish_reason) {
console.log(`\nFinish: ${choice.finish_reason}`);
if (chunk.usage) console.log('Usage:', chunk.usage);
}
}
Reading extension fields (raw HTTP)
Parse the SSE stream manually to consume our extensions: reasoning_steps (thinking, web_search, fetch_url_content, execute_python, execute_command), citations, and search_results.
const API_KEY = process.env.MIROMIND_API_KEY;
async function streamChatCompletion(messages) {
const response = await fetch('https://api.miromind.ai/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'mirothinker-1-7-deepresearch-mini',
messages,
stream: true,
}),
});
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${await response.text()}`);
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop(); // keep incomplete line in buffer
for (const line of lines) {
// Skip SSE comments (heartbeats)
if (line.startsWith(':')) continue;
if (!line.startsWith('data: ')) continue;
const payload = line.slice(6);
if (payload === '[DONE]') {
console.log('\nStream finished');
return;
}
const chunk = JSON.parse(payload);
const choice = chunk.choices?.[0];
const delta = choice?.delta;
// Reasoning steps (thinking, tool use)
if (delta?.reasoning_steps) {
for (const step of delta.reasoning_steps) {
if (step.type === 'thinking') {
console.log('[thinking]', step.thought);
} else {
console.log(`[${step.type}]`, JSON.stringify(step));
}
}
}
// Content tokens (final answer)
if (delta?.content) {
process.stdout.write(delta.content);
}
// Finish
if (choice?.finish_reason) {
console.log(`\nFinish reason: ${choice.finish_reason}`);
if (chunk.usage) {
console.log('Usage:', JSON.stringify(chunk.usage));
}
}
}
}
}
Best Practices
- Buffer incomplete lines — SSE data may arrive in partial TCP chunks
- Skip SSE comment lines (starting with ":") — these are heartbeats
- Check finish_reason in each chunk — "error" means the workflow failed, and the chunk includes an error object
- Read usage from the final chunk to track token consumption and costs
- Handle client disconnects gracefully — the server will automatically cancel the workflow if the connection drops
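For the "error" and "cancelled" cases, final-chunk handling can be sketched as below. The location and shape of the error object (chunk.error here) are assumptions — confirm them against the API reference:

```javascript
// Sketch of final-chunk handling. `chunk.error` is an assumed location
// for the error object mentioned above — verify against the API reference.
function handleFinalChunk(chunk) {
  const choice = chunk.choices?.[0];
  if (!choice?.finish_reason) return null; // not the final chunk
  if (choice.finish_reason === 'error') {
    return { ok: false, error: chunk.error ?? 'unknown error' };
  }
  if (choice.finish_reason === 'cancelled') {
    return { ok: false, error: 'cancelled' };
  }
  return { ok: true, usage: chunk.usage ?? null }; // finish_reason === 'stop'
}
```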
See the full API reference for request parameters, response schema, and reasoning step types.