Demo · Module 27 · Interactive
Streaming tokens. Tool calls.
The UI you've used a thousand times.
Hi! I'm a streaming chat demo. Try one of the suggested prompts below or type your own.
What makes this UI feel responsive. Two patterns do most of the work.

(1) Token-by-token streaming via Server-Sent Events or chunked transfer: the model emits one token at a time, and the UI appends each one as it arrives, with a blinking cursor giving the user something to watch. This turns a 5-second silent wait into an experience that feels active from the first token. For perceived speed, time to first token (TTFT) matters more than total time; visible progress kills the perception of slowness.

(2) Tool calls as a structured pause: when the model decides to use a tool (search, calculator, code interpreter), it emits a special token pattern, which streaming APIs typically surface as a structured tool-call event. The UI catches that event and renders it as a sub-bubble showing the tool name, arguments, and result before the model continues. This makes the system's reasoning legible rather than mysterious.

What this demo simulates: realistic per-character streaming delays (you control the speed), tool calls with synthetic results, and throughput tracking. These are the same UX primitives you'd build with the OpenAI/Anthropic streaming APIs in production.
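The two patterns above can be sketched in a few lines of Python. This is a minimal illustration, not how this demo or any vendor API is implemented. The event schema (`token`, `tool_call`, `done`), the helper names, and the synthetic stream are all assumptions made for the example, and the parser handles only single-line `data:` fields rather than the full SSE format.

```python
import json
import time
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each single-line `data:` field.

    Simplified: real SSE also allows multi-line data, `event:` and `id:` fields.
    """
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())
        # Blank separator lines and anything else are skipped.


def render(events: Iterable[dict], clock=time.perf_counter):
    """Build the transcript a UI would show and track TTFT and throughput."""
    start = clock()
    transcript = []  # ordered text bubbles and tool sub-bubbles
    ttft = None      # time to first token, in seconds
    token_count = 0
    for event in events:
        if event["type"] == "token":
            if ttft is None:
                ttft = clock() - start
            token_count += 1
            # Append to the current text bubble, or open a new one.
            if transcript and transcript[-1]["kind"] == "text":
                transcript[-1]["text"] += event["text"]
            else:
                transcript.append({"kind": "text", "text": event["text"]})
        elif event["type"] == "tool_call":
            # Structured pause: show tool name, arguments, and result.
            transcript.append({
                "kind": "tool",
                "name": event["name"],
                "args": event["args"],
                "result": event["result"],
            })
        elif event["type"] == "done":
            break
    elapsed = clock() - start
    tokens_per_s = token_count / elapsed if elapsed > 0 else 0.0
    return transcript, ttft, tokens_per_s


def fake_stream(delay_s: float = 0.0) -> Iterator[str]:
    """Synthetic stream with a hypothetical event schema and a made-up tool result."""
    raw = [
        'data: {"type": "token", "text": "Let me "}',
        "",
        'data: {"type": "token", "text": "check. "}',
        "",
        'data: {"type": "tool_call", "name": "calculator", '
        '"args": {"expr": "6*7"}, "result": "42"}',
        "",
        'data: {"type": "token", "text": "The answer is 42."}',
        "",
        'data: {"type": "done"}',
    ]
    for line in raw:
        time.sleep(delay_s)  # simulated per-event delay
        yield line
```

Calling `render(parse_sse(fake_stream(delay_s=0.05)))` returns the transcript (a text bubble, a tool sub-bubble, then another text bubble) along with the measured TTFT and tokens per second. Raising `delay_s` lowers throughput and delays the first token, which mirrors the speed control in the demo.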