Streaming Guide

How to consume SSE streams from the Chat Completions API

Overview

When you send a request to POST /v1/chat/completions with stream: true (the default), the response is delivered as Server-Sent Events (SSE). Each event is a line prefixed with "data: " followed by a JSON chunk, and the stream ends with "data: [DONE]".
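
If you read the stream yourself, each raw line can be split into its payload before any other handling. A minimal sketch (the ssePayload helper is ours, not part of any SDK):

```javascript
// Extract the payload from one raw SSE line (sketch).
// Returns the parsed JSON chunk, the literal string "[DONE]" for the
// end-of-stream signal, or null for anything else (comments, blank lines).
function ssePayload(line) {
  if (!line.startsWith('data: ')) return null;
  const payload = line.slice('data: '.length);
  return payload === '[DONE]' ? '[DONE]' : JSON.parse(payload);
}

const chunk = ssePayload('data: {"choices":[{"delta":{"content":"Hello"}}]}');
console.log(chunk.choices[0].delta.content); // Hello
```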

Stream Lifecycle

A stream goes through several phases in order:

1. role chunk        {"delta": {"role": "assistant"}}
2. reasoning chunks  {"delta": {"reasoning_steps": [...]}}  (thinking, tool use)
3. content chunks    {"delta": {"content": "..."}}          (final answer)
4. finish chunk      {"delta": {}, "finish_reason": "stop", "usage": {...}}
5. [DONE] signal     data: [DONE]
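
As a sketch, the phase of any given chunk can be inferred from its delta and finish_reason (the chunkPhase helper is illustrative, not part of the API):

```javascript
// Map a chat.completion.chunk to its lifecycle phase (sketch).
function chunkPhase(chunk) {
  const choice = chunk.choices?.[0];
  if (!choice) return 'unknown';
  if (choice.finish_reason) return 'finish';      // 4. finish chunk
  const delta = choice.delta ?? {};
  if (delta.role) return 'role';                  // 1. role chunk
  if (delta.reasoning_steps) return 'reasoning';  // 2. reasoning chunks
  if (delta.content != null) return 'content';    // 3. content chunks
  return 'unknown';
}
```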

Phase 1: Reasoning

The model emits reasoning_steps in the delta — thinking, web search, code execution, tool calls. This phase may take time as the model uses tools.

Phase 2: Final Answer

After reasoning completes, the final answer streams via delta.content token by token, just like standard OpenAI streaming. The last chunk carries finish_reason and usage.

Chunk Format

Each SSE line contains a JSON chunk following the OpenAI chat.completion.chunk format:

SSE Format
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"mirothinker-1-7-deepresearch-mini","created":1712345678,"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"mirothinker-1-7-deepresearch-mini","created":1712345678,"choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":100,"total_tokens":110}}

data: [DONE]

Chunk Fields

id                        Unique identifier for this completion (format: chatcmpl-{workflow_id})
object                    Always "chat.completion.chunk"
model                     The model used for this completion
choices[].delta           Incremental content; may contain role, content, or reasoning_steps
choices[].finish_reason   null during streaming, then "stop", "error", or "cancelled" in the final chunk
usage                     Token usage stats, present only in the final chunk (prompt_tokens, completion_tokens, total_tokens, reasoning_tokens, num_search_queries)
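
A stream's chunks can be folded into a single completed message using these fields. A minimal reducer sketch (the helper name is ours):

```javascript
// Fold a sequence of chunks into the completed message (sketch).
function reduceChunks(chunks) {
  const result = { content: '', finish_reason: null, usage: null };
  for (const chunk of chunks) {
    const choice = chunk.choices?.[0];
    if (choice?.delta?.content) result.content += choice.delta.content;
    if (choice?.finish_reason) result.finish_reason = choice.finish_reason;
    if (chunk.usage) result.usage = chunk.usage; // present in final chunk only
  }
  return result;
}
```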

Delta Types

The delta object in each chunk contains one of the following, matching the lifecycle phases above:

role chunk         {"delta": {"role": "assistant"}}
reasoning chunks   {"delta": {"reasoning_steps": [...]}}
content chunks     {"delta": {"content": "..."}}
finish chunk       {"delta": {}}  (with finish_reason set)

Heartbeat

During long pauses (agent thinking, tool execution), the server sends SSE comment lines as keep-alive heartbeats to prevent proxy/CDN idle timeouts:

: heartbeat

SSE comments (lines starting with ":") are part of the SSE spec and are automatically ignored by EventSource clients. If you parse SSE manually, skip lines that start with ":".

Complete Example

A complete streaming client that handles reasoning steps, content, finish reason, usage, and heartbeats:

Standard usage (OpenAI SDK)

Uses the official OpenAI SDK. Returns standard fields: content, finish_reason, usage. The SDK drops unknown fields, so reasoning_steps / citations / search_results are not accessible this way.

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.MIROMIND_API_KEY,
  baseURL: 'https://api.miromind.ai/v1',
});

const stream = await client.chat.completions.create({
  model: 'mirothinker-1-7-deepresearch-mini',
  messages: [{ role: 'user', content: 'Hello' }],
  stream: true,
});

for await (const chunk of stream) {
  const choice = chunk.choices[0];
  const delta = choice.delta;

  // Final answer (token-by-token)
  if (delta?.content) {
    process.stdout.write(delta.content);
  }

  // Finish + usage (last chunk)
  if (choice.finish_reason) {
    console.log(`\nFinish: ${choice.finish_reason}`);
    if (chunk.usage) console.log('Usage:', chunk.usage);
  }
}

Reading extension fields (raw HTTP)

Parse the SSE stream manually to consume our extensions: reasoning_steps (thinking, web_search, fetch_url_content, execute_python, execute_command), citations, and search_results.

const API_KEY = process.env.MIROMIND_API_KEY;

async function streamChatCompletion(messages) {
  const response = await fetch('https://api.miromind.ai/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'mirothinker-1-7-deepresearch-mini',
      messages,
      stream: true,
    }),
  });

  if (!response.ok) {
    throw new Error(`HTTP ${response.status}: ${await response.text()}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep incomplete line in buffer

    for (const line of lines) {
      // Skip SSE comments (heartbeats)
      if (line.startsWith(':')) continue;

      if (!line.startsWith('data: ')) continue;
      const payload = line.slice(6);

      if (payload === '[DONE]') {
        console.log('\nStream finished');
        return;
      }

      const chunk = JSON.parse(payload);
      const choice = chunk.choices?.[0];
      const delta = choice?.delta;

      // Reasoning steps (thinking, tool use)
      if (delta?.reasoning_steps) {
        for (const step of delta.reasoning_steps) {
          if (step.type === 'thinking') {
            console.log('[thinking]', step.thought);
          } else {
            console.log(`[${step.type}]`, JSON.stringify(step));
          }
        }
      }

      // Content tokens (final answer)
      if (delta?.content) {
        process.stdout.write(delta.content);
      }

      // Finish
      if (choice?.finish_reason) {
        console.log(`\nFinish reason: ${choice.finish_reason}`);
        if (chunk.usage) {
          console.log('Usage:', JSON.stringify(chunk.usage));
        }
      }
    }
  }
}

Best Practices

  • Buffer incomplete lines — SSE data may arrive in partial TCP chunks
  • Skip SSE comment lines (starting with ":") — these are heartbeats
  • Check finish_reason in each chunk — "error" means the workflow failed, and the chunk includes an error object
  • Read usage from the final chunk to track token consumption and costs
  • Handle client disconnects gracefully — the server will automatically cancel the workflow if the connection drops
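
The error case in the third bullet can be handled by inspecting the final chunk. A sketch, assuming the error object rides on the chunk itself (check the API reference for its exact shape and placement):

```javascript
// Surface workflow failures from the final chunk (sketch; the error
// object's fields and location are assumptions, not confirmed here).
function checkFinish(chunk) {
  const choice = chunk.choices?.[0];
  if (!choice?.finish_reason) return; // not the final chunk
  if (choice.finish_reason === 'error') {
    throw new Error(`Workflow failed: ${JSON.stringify(chunk.error ?? {})}`);
  }
  if (choice.finish_reason === 'cancelled') {
    console.warn('Workflow was cancelled before completing');
  }
}
```
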

Next Steps

See the full API reference for request parameters, response schema, and reasoning step types.
