TL;DR: The Fetch API exposes response bodies as ReadableStreams, enabling incremental processing of data as it arrives rather than buffering the entire response in memory.
How It Works
┌───────────┐ ┌────────────┐
│ fetch() │────────→│ Response │
└───────────┘ └────────────┘
│
┌────┘
↓
┌────────────────┐
│ .body │
└────────────────┘
│
┌┘
↓
┌────────────────────────┐
│ ReadableStream │
└────────────────────────┘
│
│
↓
┌────────────────────────┐
│ getReader() │
└────────────────────────┘
│
│
↓
┌────────────────────────────────────────┐
│ read() -> {value, done} │ loop until done
└────────────────────────────────────────┘
When fetch() resolves, the Response object's .body property is a ReadableStream of Uint8Array chunks. The convenience methods .json(), .text(), and .blob() consume this stream entirely into memory. Streaming access bypasses this buffering, letting you process data chunk by chunk as it arrives from the network.
To read the stream manually, acquire a reader with response.body.getReader(). This locks the stream -- no other consumer can read from it simultaneously. Call reader.read() in a loop; each call returns a promise resolving to { value: Uint8Array, done: boolean }. When done is true, the stream is exhausted. The value chunks are raw bytes -- for text content, decode them through a TextDecoder instance (using the stream: true option to handle multi-byte characters split across chunk boundaries).
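A minimal sketch of that read loop, assuming a generic helper name (readBodyAsText is not a standard API). It is demonstrated against a locally constructed Response so the snippet is self-contained; the same loop works on any fetch() response with a readable body.

```javascript
// Drain a Response body chunk by chunk, decoding text safely across
// chunk boundaries with stream: true.
async function readBodyAsText(response) {
  const reader = response.body.getReader(); // locks the stream
  const decoder = new TextDecoder();
  let text = '';
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    // stream: true buffers incomplete multi-byte sequences for the next call
    text += decoder.decode(value, { stream: true });
  }
  text += decoder.decode(); // flush any trailing bytes
  return text;
}

// Demo: a multi-byte character ("é" = 0xC3 0xA9) deliberately split
// across two chunks.
const demo = new Response(new ReadableStream({
  start(controller) {
    controller.enqueue(new Uint8Array([0x63, 0x61, 0x66, 0xC3])); // "caf" + first byte of "é"
    controller.enqueue(new Uint8Array([0xA9]));                   // second byte of "é"
    controller.close();
  },
}));

readBodyAsText(demo).then((text) => console.log(text)); // "café"
```

Without stream: true, the first decode() call would emit a replacement character for the dangling 0xC3 byte.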
NDJSON (newline-delimited JSON) streaming is a common pattern for server-sent structured data. The server sends one JSON object per line, and the client processes each line as it arrives. The implementation reads chunks, splits on newlines, handles partial lines spanning chunk boundaries (accumulate in a buffer until a complete line arrives), and parses each complete line as JSON. This enables progressive UI updates from a single HTTP request.
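A sketch of that buffering logic, with readNDJSON and the per-object callback being hypothetical names, not a standard API:

```javascript
// Parse newline-delimited JSON from a Response body, accumulating
// partial lines across chunk boundaries.
async function readNDJSON(response, onObject) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // last element is a (possibly empty) partial line
    for (const line of lines) {
      if (line.trim()) onObject(JSON.parse(line)); // complete line -> object
    }
  }
  if (buffer.trim()) onObject(JSON.parse(buffer)); // final unterminated line
}
```

The key move is lines.pop(): everything before the last newline is complete and parseable; the remainder waits in the buffer for the next chunk.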
Server-Sent Events (SSE) streaming via fetch is an alternative to the EventSource API, offering more control. The response has Content-Type: text/event-stream, and you parse the SSE protocol (data:, event:, id: fields) from the stream yourself. The advantage over EventSource is that fetch supports custom headers (authorization tokens), POST requests, and AbortController cancellation.
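A minimal sketch of such a parser: it splits the stream into events on blank lines and collects the data:, event:, and id: fields. It skips edge cases a production parser handles (comments, retry:, CRLF line endings), and readSSE is an illustrative name, not an API.

```javascript
// Parse text/event-stream frames from a Response body.
async function readSSE(response, onEvent) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    let sep;
    while ((sep = buffer.indexOf('\n\n')) !== -1) { // blank line = end of event
      const raw = buffer.slice(0, sep);
      buffer = buffer.slice(sep + 2);
      const event = { event: 'message', data: [] };
      for (const line of raw.split('\n')) {
        if (line.startsWith('data:')) event.data.push(line.slice(5).trimStart());
        else if (line.startsWith('event:')) event.event = line.slice(6).trim();
        else if (line.startsWith('id:')) event.id = line.slice(3).trim();
      }
      onEvent({ ...event, data: event.data.join('\n') }); // multi-line data joins with \n
    }
  }
}
```

Because this runs over a plain fetch(), the request can carry an Authorization header, use POST, and be cancelled via an AbortController signal, none of which EventSource allows.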
Progress tracking becomes straightforward with streaming. The Content-Length header (available via response.headers.get('Content-Length')) gives the total size, and you accumulate value.byteLength from each chunk to calculate the percentage complete. One caveat: for compressed responses (Content-Encoding: gzip), Content-Length reports the compressed size while the chunks you receive are decompressed bytes, so the ratio can overshoot 100%. This works for downloads, file processing, and providing genuine progress bars rather than indeterminate spinners.
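A sketch of the accounting, assuming an onProgress callback you supply (and bearing in mind that Content-Length may be absent or, for compressed responses, not match the decoded byte count):

```javascript
// Download a body while reporting (receivedBytes, totalBytes) per chunk.
async function download(response, onProgress) {
  const total = Number(response.headers.get('Content-Length')) || 0; // 0 = unknown
  const reader = response.body.getReader();
  const chunks = [];
  let received = 0;
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    chunks.push(value);
    received += value.byteLength;
    onProgress(received, total); // e.g. update a progress bar element
  }
  return chunks;
}
```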
TransformStream enables in-flight processing without accumulation. Pipe the response body through a transform to decompress, decrypt, parse, or filter data incrementally: response.body.pipeThrough(new DecompressionStream('gzip')).pipeThrough(new TextDecoderStream()). Each transform stage processes chunks independently, and backpressure propagates through the entire pipeline.
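A small sketch of a pipeline with a custom stage: a TransformStream that upper-cases text chunks stands in for a decompress or decrypt stage (the upperCase helper is illustrative, not an API). Each stage sees chunks as they flow; nothing is buffered whole.

```javascript
// A custom transform stage: strings in, upper-cased strings out.
const upperCase = () => new TransformStream({
  transform(chunk, controller) {
    controller.enqueue(chunk.toUpperCase());
  },
});

async function run() {
  const source = new Response('stream me').body;  // ReadableStream<Uint8Array>
  const decoded = source
    .pipeThrough(new TextDecoderStream())         // bytes -> strings
    .pipeThrough(upperCase());                    // strings -> upper-cased strings
  let out = '';
  for await (const chunk of decoded) out += chunk; // async iteration drains the stream
  return out;
}

run().then(console.log); // "STREAM ME"
```

Swapping the first stage for new DecompressionStream('gzip') gives the decompression pipeline from the text above, with the same shape.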
Streaming into the DOM is where this pattern shines for perceived performance. An LLM chat response, for instance, can render tokens as they arrive rather than waiting for the complete response. Read chunks, decode to text, and append to the DOM in the read loop. The user sees content appearing progressively, dramatically improving time-to-first-meaningful-content.
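A sketch of that render loop, with the DOM append factored out as a callback so the loop itself is environment-agnostic (streamInto and appendText are illustrative names):

```javascript
// Decode chunks and hand text to a sink as soon as it arrives.
async function streamInto(response, appendText) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    appendText(decoder.decode(value, { stream: true })); // user sees text immediately
  }
  appendText(decoder.decode()); // flush trailing bytes
}
```

In a chat UI, appendText might be (t) => outputEl.insertAdjacentText('beforeend', t), so each token lands in the page the moment its chunk arrives.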
For upload streaming, the Fetch API also supports a ReadableStream as the request body (with limited browser support, currently concentrated in Chromium-based browsers). This enables streaming uploads from a file, canvas, or generated data without loading the entire payload into memory. The request must use HTTP/2 or later (HTTP/1.1 requires Content-Length upfront, which is unknown for a stream). You must set duplex: 'half' in the fetch options to signal a streaming body; this declares half-duplex operation, meaning the response does not become readable until the upload has completed.
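A sketch of the shape of such a request; the endpoint URL is a placeholder, and the fetch call is shown commented out since it needs a live HTTP/2 server:

```javascript
// Build a ReadableStream body from string chunks, generated on the fly.
function chunkedBody(chunks) {
  const encoder = new TextEncoder();
  return new ReadableStream({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(encoder.encode(chunk));
      controller.close();
    },
  });
}

// fetch('https://example.com/upload', {       // placeholder endpoint
//   method: 'POST',
//   body: chunkedBody(['part-1', 'part-2']),  // no Content-Length known upfront
//   duplex: 'half',                           // mandatory for stream bodies
// });
```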
The Response constructor also accepts a ReadableStream, enabling synthetic responses for Service Workers. Intercept a fetch event, create a transform pipeline, and return a new Response wrapping the transformed stream. The page receives the data incrementally as if it came from the network.
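A sketch of such a synthetic response: the transform here just upper-cases text (a real Service Worker might inject markup or rewrite chunks), and transformedResponse is an illustrative name. In a worker, the returned Response would go to event.respondWith().

```javascript
// Wrap a transformed copy of an upstream body in a new Response.
function transformedResponse(upstream) {
  const rewritten = upstream.body
    .pipeThrough(new TextDecoderStream())        // bytes -> strings
    .pipeThrough(new TransformStream({
      transform(chunk, controller) { controller.enqueue(chunk.toUpperCase()); },
    }))
    .pipeThrough(new TextEncoderStream());       // strings -> bytes
  return new Response(rewritten, { headers: { 'Content-Type': 'text/plain' } });
}

transformedResponse(new Response('hello')).text().then(console.log); // "HELLO"
```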
Memory management is the primary motivation. A 500MB file download processed via .blob() requires 500MB of memory. Streaming through a hash function, parser, or write-to-disk pipeline via the File System Access API keeps memory usage proportional to the chunk size (typically 64KB-1MB), not the file size.
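A sketch of the constant-memory pattern: pipe the body into a WritableStream sink that processes each chunk and lets it go. The sink here merely counts bytes (an assumption for the demo); a real one might feed an incremental hash or a File System Access writable.

```javascript
// Consume a body with memory proportional to chunk size, not body size.
async function measure(response) {
  let bytes = 0;
  await response.body.pipeTo(new WritableStream({
    write(chunk) { bytes += chunk.byteLength; }, // chunk is collectable after this
  }));
  return bytes;
}

measure(new Response(new Uint8Array(1024))).then(console.log); // 1024
```

Backpressure is built in: if write() returns a pending promise, the source pauses until the sink catches up.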
Gotchas
- response.body can only be consumed once -- calling getReader() locks the stream; calling .json() after reading chunks will throw; use response.clone() before consuming if you need multiple reads
- TextDecoder must use stream: true for chunked text -- without it, multi-byte UTF-8 characters split across chunks will produce replacement characters (garbage output at chunk boundaries)
- Chunk sizes are not controllable -- the browser delivers chunks at whatever size the network/HTTP layer provides; do not assume fixed chunk sizes or line-aligned boundaries
- Streaming fetch does not work with opaque responses -- cross-origin requests without CORS headers return opaque responses where body is null, even in no-cors mode
- HTTP/1.1 servers may buffer the entire response -- streaming requires the server to flush chunks progressively; many frameworks buffer by default and need explicit flush calls