Streaming Guide

How to consume SSE streams from the Chat Completions API

Overview

When you send a request to POST /v1/chat/completions with stream: true (the default), the response is delivered as Server-Sent Events (SSE). Each event is a line prefixed with "data: " followed by a JSON chunk, and the stream ends with "data: [DONE]".
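
If you read the stream yourself, each raw line can be split into its payload before any other handling. A minimal sketch (the ssePayload helper is ours, not part of any SDK):

```javascript
// Extract the payload from one raw SSE line (sketch).
// Returns the parsed JSON chunk, the literal string "[DONE]" for the
// end-of-stream signal, or null for anything else (comments, blank lines).
function ssePayload(line) {
  if (!line.startsWith('data: ')) return null;
  const payload = line.slice('data: '.length);
  return payload === '[DONE]' ? '[DONE]' : JSON.parse(payload);
}

const chunk = ssePayload('data: {"choices":[{"delta":{"content":"Hello"}}]}');
console.log(chunk.choices[0].delta.content); // Hello
```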

Stream Lifecycle

A stream goes through several phases in order:

1. role chunk        {"delta": {"role": "assistant"}}
2. reasoning chunks  {"delta": {"reasoning_steps": [...]}}  (thinking, tool use)
3. content chunks    {"delta": {"content": "..."}}          (final answer)
4. finish chunk      {"delta": {}, "finish_reason": "stop", "usage": {...}}
5. [DONE] signal     data: [DONE]
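
As a sketch, the phase of any given chunk can be inferred from its delta and finish_reason (the chunkPhase helper is illustrative, not part of the API):

```javascript
// Map a chat.completion.chunk to its lifecycle phase (sketch).
function chunkPhase(chunk) {
  const choice = chunk.choices?.[0];
  if (!choice) return 'unknown';
  if (choice.finish_reason) return 'finish';      // 4. finish chunk
  const delta = choice.delta ?? {};
  if (delta.role) return 'role';                  // 1. role chunk
  if (delta.reasoning_steps) return 'reasoning';  // 2. reasoning chunks
  if (delta.content != null) return 'content';    // 3. content chunks
  return 'unknown';
}
```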

Phase 1: Reasoning

The model emits reasoning_steps in the delta — thinking, web search, code execution, tool calls. This phase may take time as the model uses tools.

Phase 2: Final Answer

After reasoning completes, the final answer streams via delta.content token by token, just like standard OpenAI streaming. The last chunk carries finish_reason and usage.

Chunk Format

Each SSE line contains a JSON chunk following the OpenAI chat.completion.chunk format:

SSE Format
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"mirothinker-1-7-deepresearch-mini","created":1712345678,"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"mirothinker-1-7-deepresearch-mini","created":1712345678,"choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":100,"total_tokens":110}}

data: [DONE]

Chunk Fields

id                        Unique identifier for this completion (format: chatcmpl-{workflow_id})
object                    Always "chat.completion.chunk"
model                     The model used for this completion
choices[].delta           Incremental content; may contain role, content, or reasoning_steps
choices[].finish_reason   null during streaming, then "stop", "error", or "cancelled" in the final chunk
usage                     Token usage stats, present only in the final chunk (prompt_tokens, completion_tokens, total_tokens, reasoning_tokens, num_search_queries)
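
A stream's chunks can be folded into a single completed message using these fields. A minimal reducer sketch (the helper name is ours):

```javascript
// Fold a sequence of chunks into the completed message (sketch).
function reduceChunks(chunks) {
  const result = { content: '', finish_reason: null, usage: null };
  for (const chunk of chunks) {
    const choice = chunk.choices?.[0];
    if (choice?.delta?.content) result.content += choice.delta.content;
    if (choice?.finish_reason) result.finish_reason = choice.finish_reason;
    if (chunk.usage) result.usage = chunk.usage; // present in final chunk only
  }
  return result;
}
```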

Delta Types

The delta object in each chunk contains one of the following, matching the lifecycle phases above:

role chunk         {"delta": {"role": "assistant"}}
reasoning chunks   {"delta": {"reasoning_steps": [...]}}
content chunks     {"delta": {"content": "..."}}
finish chunk       {"delta": {}}  (with finish_reason set)

Heartbeat

During long pauses (agent thinking, tool execution), the server sends SSE comment lines as keep-alive heartbeats to prevent proxy/CDN idle timeouts:

: heartbeat

SSE comments (lines starting with ":") are part of the SSE spec and are automatically ignored by EventSource clients. If you parse SSE manually, skip lines that start with ":".

Complete Example

A complete streaming client that handles reasoning steps, content, finish reason, usage, and heartbeats:

Standard usage (OpenAI SDK)

Uses the official OpenAI SDK. Returns standard fields: content, finish_reason, usage. The SDK drops unknown fields, so reasoning_steps / citations / search_results are not accessible this way.

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.MIROMIND_API_KEY,
  baseURL: 'https://api.miromind.ai/v1',
});

const stream = await client.chat.completions.create({
  model: 'mirothinker-1-7-deepresearch-mini',
  messages: [{ role: 'user', content: 'Hello' }],
  stream: true,
});

for await (const chunk of stream) {
  const choice = chunk.choices[0];
  const delta = choice.delta;

  // Final answer (token-by-token)
  if (delta?.content) {
    process.stdout.write(delta.content);
  }

  // Finish + usage (last chunk)
  if (choice.finish_reason) {
    console.log(`\nFinish: ${choice.finish_reason}`);
    if (chunk.usage) console.log('Usage:', chunk.usage);
  }
}

Reading extension fields (raw HTTP)

Parse the SSE stream manually to consume our extensions: reasoning_steps (thinking, web_search, fetch_url_content, execute_python, execute_command), citations, and search_results.

const API_KEY = process.env.MIROMIND_API_KEY;

async function streamChatCompletion(messages) {
  const response = await fetch('https://api.miromind.ai/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'mirothinker-1-7-deepresearch-mini',
      messages,
      stream: true,
    }),
  });

  if (!response.ok) {
    throw new Error(`HTTP ${response.status}: ${await response.text()}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep incomplete line in buffer

    for (const line of lines) {
      // Skip SSE comments (heartbeats)
      if (line.startsWith(':')) continue;

      if (!line.startsWith('data: ')) continue;
      const payload = line.slice(6);

      if (payload === '[DONE]') {
        console.log('\nStream finished');
        return;
      }

      const chunk = JSON.parse(payload);
      const choice = chunk.choices?.[0];
      const delta = choice?.delta;

      // Reasoning steps (thinking, tool use)
      if (delta?.reasoning_steps) {
        for (const step of delta.reasoning_steps) {
          if (step.type === 'thinking') {
            console.log('[thinking]', step.thought);
          } else {
            console.log(`[${step.type}]`, JSON.stringify(step));
          }
        }
      }

      // Content tokens (final answer)
      if (delta?.content) {
        process.stdout.write(delta.content);
      }

      // Finish
      if (choice?.finish_reason) {
        console.log(`\nFinish reason: ${choice.finish_reason}`);
        if (chunk.usage) {
          console.log('Usage:', JSON.stringify(chunk.usage));
        }
      }
    }
  }
}

Best Practices

  • Buffer incomplete lines — SSE data may arrive in partial TCP chunks
  • Skip SSE comment lines (starting with ":") — these are heartbeats
  • Check finish_reason in each chunk — "error" means the workflow failed, and the chunk includes an error object
  • Read usage from the final chunk to track token consumption and costs
  • Handle client disconnects gracefully — the server will automatically cancel the workflow if the connection drops
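
The error case in the third bullet can be handled by inspecting the final chunk. A sketch, assuming the error object rides on the chunk itself (check the API reference for its exact shape and placement):

```javascript
// Surface workflow failures from the final chunk (sketch; the error
// object's fields and location are assumptions, not confirmed here).
function checkFinish(chunk) {
  const choice = chunk.choices?.[0];
  if (!choice?.finish_reason) return; // not the final chunk
  if (choice.finish_reason === 'error') {
    throw new Error(`Workflow failed: ${JSON.stringify(chunk.error ?? {})}`);
  }
  if (choice.finish_reason === 'cancelled') {
    console.warn('Workflow was cancelled before completing');
  }
}
```
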

Next Steps

See the full API reference for request parameters, response schema, and reasoning step types.
