Simplify LLM Streaming

September 2025
Simplify LLM streaming with generator functions

Generator functions are very cool, but until recently I hadn't found a use case for them. That changed when I started working on streaming responses from an LLM. Streamed responses are great: they give the user visual feedback during what would otherwise be a very long delay. However, the code that handles the streamed response isn't something you'd want to duplicate in every React component. By leveraging generator functions, it's possible to extract the complicated bits and keep the response handlers readable.
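
If you haven't used async generators before, the core idea is a function that can yield values over time and be consumed with a plain for await...of loop. Here's a tiny standalone sketch (the function name is made up, just to show the mechanics):

async function* countToThree(): AsyncGenerator<number, void, unknown> {
  for (let i = 1; i <= 3; i++) {
    // Pretend each value takes a moment to arrive, like a network chunk.
    await new Promise((resolve) => setTimeout(resolve, 100));
    yield i;
  }
}

async function main() {
  for await (const n of countToThree()) {
    console.log(n); // logs 1, then 2, then 3
  }
}

main();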

Below is a basic example to illustrate the concept:

async function* streamResponse(response: Response): AsyncGenerator<string, void, unknown> {
  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }

  if (!response.body) {
    throw new Error('No response body');
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  // Holds any partial line left over from the previous chunk, since a
  // network chunk can end in the middle of an SSE line.
  let buffer = '';

  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });

      // Only process complete lines; keep the trailing partial line in the buffer.
      const lines = buffer.split('\n');
      buffer = lines.pop() ?? '';

      for (const line of lines) {
        if (line.trim() && line.startsWith('data: ')) {
          const data = line.slice(6);
          if (data === '[DONE]') return;

          try {
            const parsed = JSON.parse(data);
            const content = extractContent(parsed);
            if (content) yield content;
          } catch (e) {
            // Skip lines that aren't valid JSON (e.g. keep-alive comments).
            continue;
          }
        }
      }
    }
  } finally {
    reader.releaseLock();
  }
}

function extractContent(parsed: any): string | null {
  // Handle a few common response shapes: OpenAI-style chat deltas,
  // plus providers that return a flat `content` or `text` field.
  return parsed.choices?.[0]?.delta?.content ||
         parsed.content ||
         parsed.text ||
         null;
}
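
To make that concrete, an OpenAI-style stream sends each delta as a `data: ` line containing a small JSON payload; other providers differ, which is why extractContent checks a few shapes. Roughly:

// One line from the stream looks like:
//   data: {"choices":[{"delta":{"content":"Hel"}}]}
const sample = JSON.parse('{"choices":[{"delta":{"content":"Hel"}}]}');
console.log(extractContent(sample)); // "Hel"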

async function useStreamingLLM() {
  const payload = {
    model: 'your-model',
    messages: [{ role: 'user', content: 'Hello, world!' }],
    stream: true,
  };

  const response = await fetch(
    'https://api.your-llm-provider.com/v1/completions',
    {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer your-api-key',
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(payload),
    }
  );

  // The calling code stays readable: just loop over the generator.
  for await (const chunk of streamResponse(response)) {
    console.log('Received chunk:', chunk);
    // setState(prev => prev + chunk)
  }
}
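
Because all of the parsing lives inside the generator, the handler inside a React component stays short. Here's a minimal sketch of what that might look like (the hook name useChatStream and the /api/chat endpoint are hypothetical):

import { useCallback, useState } from 'react';

function useChatStream() {
  const [output, setOutput] = useState('');

  const send = useCallback(async (prompt: string) => {
    setOutput('');
    const response = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ prompt }),
    });

    // All the SSE parsing is hidden behind the generator defined above.
    for await (const chunk of streamResponse(response)) {
      setOutput((prev) => prev + chunk);
    }
  }, []);

  return { output, send };
}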

The generator could be made more portable by passing in the content-extraction logic instead of hardcoding it, and a schema library such as Zod could validate each parsed chunk rather than relying on a loose any type.
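
As a rough sketch of the Zod idea (the schema below assumes an OpenAI-style delta shape; adjust it for your provider):

import { z } from 'zod';

// Assumed shape of a single streamed delta; real providers vary.
const chunkSchema = z.object({
  choices: z.array(
    z.object({
      delta: z.object({ content: z.string().optional() }),
    })
  ),
});

function extractContentSafe(parsed: unknown): string | null {
  const result = chunkSchema.safeParse(parsed);
  if (!result.success) return null;
  return result.data.choices[0]?.delta.content ?? null;
}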