From 20a88feb8fe9a52382866166e897ddb5bfae199b Mon Sep 17 00:00:00 2001 From: isaacs Date: Thu, 13 Dec 2012 11:15:49 -0800 Subject: [PATCH] docs: streams2 --- doc/api/stream.markdown | 489 ++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 394 insertions(+), 95 deletions(-) diff --git a/doc/api/stream.markdown b/doc/api/stream.markdown index 5e0c6f6..974c5dd 100644 --- a/doc/api/stream.markdown +++ b/doc/api/stream.markdown @@ -7,186 +7,485 @@ Node. For example a request to an HTTP server is a stream, as is stdout. Streams are readable, writable, or both. All streams are instances of [EventEmitter][] -You can load up the Stream base class by doing `require('stream')`. +You can load the Stream base classes by doing `require('stream')`. +There are base classes provided for Readable streams, Writable +streams, Duplex streams, and Transform streams. -## Readable Stream +## Compatibility + +In earlier versions of Node, the Readable stream interface was +simpler, but also less powerful and less useful. + +* Rather than waiting for you to call the `read()` method, `'data'` + events would start emitting immediately. If you needed to do some + I/O to decide how to handle data, then you had to store the chunks + in some kind of buffer so that they would not be lost. +* The `pause()` method was advisory, rather than guaranteed. This + meant that you still had to be prepared to receive `'data'` events + even when the stream was in a paused state. + +In Node v0.10, the Readable class described below was added. For +backwards compatibility with older Node programs, Readable streams +switch into "old mode" when a `'data'` event handler is added, or when +the `pause()` or `resume()` methods are called. The effect is that, +even if you are not using the new `read()` method and `'readable'` +event, you no longer have to worry about losing `'data'` chunks. + +Most programs will continue to function normally. 
However, this +introduces an edge case under the following conditions: + +* No `'data'` event handler is added. +* The `pause()` and `resume()` methods are never called. + +For example, consider the following code: + +```javascript +// WARNING! BROKEN! +net.createServer(function(socket) { + + // we add an 'end' listener, but never consume the data + socket.on('end', function() { + // It will never get here. + socket.end('I got your message (but didnt read it)\n'); + }); + +}).listen(1337); +``` + +In versions of Node prior to v0.10, the incoming message data would be +simply discarded. However, in Node v0.10 and beyond, the socket will +remain paused forever. + +The workaround in this situation is to call the `resume()` method to +trigger "old mode" behavior: + +```javascript +// Workaround +net.createServer(function(socket) { + + socket.on('end', function() { + socket.end('I got your message (but didnt read it)\n'); + }); + + // start the flow of data, discarding it. + socket.resume(); + +}).listen(1337); +``` + +In addition to new Readable streams switching into "old mode", pre-v0.10 +style streams can be wrapped in a Readable class using the `wrap()` +method. + +## Class: stream.Readable A `Readable Stream` has the following methods, members, and events. -### Event: 'data' +Note that `stream.Readable` is an abstract class designed to be +extended with an underlying implementation of the `_read(size, cb)` +method. (See below.) -`function (data) { }` +### new stream.Readable([options]) -The `'data'` event emits either a `Buffer` (by default) or a string if -`setEncoding()` was used. +* `options` {Object} + * `bufferSize` {Number} The size of the chunks to consume from the + underlying resource. Default=16kb + * `lowWaterMark` {Number} The minimum number of bytes to store in + the internal buffer before emitting `readable`. 
Default=0 + * `highWaterMark` {Number} The maximum number of bytes to store in + the internal buffer before ceasing to read from the underlying + resource. Default=16kb + * `encoding` {String} If specified, then buffers will be decoded to + strings using the specified encoding. Default=null -Note that the __data will be lost__ if there is no listener when a -`Readable Stream` emits a `'data'` event. +In classes that extend the Readable class, make sure to call the +constructor so that the buffering settings can be properly +initialized. -### Event: 'end' +### readable.\_read(size, callback) + +* `size` {Number} Number of bytes to read asynchronously +* `callback` {Function} Called with an error or with data + +All Readable stream implementations must provide a `_read` method +to fetch data from the underlying resource. + +**This function MUST NOT be called directly.** It should be +implemented by child classes, and called by the internal Readable +class methods only. + +Call the callback using the standard `callback(error, data)` pattern. +When no more data can be fetched, call `callback(null, null)` to +signal the EOF. -`function () { }` +This method is prefixed with an underscore because it is internal to +the class that defines it, and should not be called directly by user +programs. However, you **are** expected to override this method in +your own extension classes. + + +### readable.wrap(stream) + +* `stream` {Stream} An "old style" readable stream + +If you are using an older Node library that emits `'data'` events and +has a `pause()` method that is advisory only, then you can use the +`wrap()` method to create a Readable stream that uses the old stream +as its data source. + +For example: + +```javascript +var OldReader = require('./old-api-module.js').OldReader; +var oreader = new OldReader; +var Readable = require('stream').Readable; +var myReader = new Readable().wrap(oreader); + +myReader.on('readable', function() { + myReader.read(); // etc. 
+}); +``` + +### Event: 'readable' + +When there is data ready to be consumed, this event will fire. The +number of bytes that are required to be considered "readable" depends +on the `lowWaterMark` option set in the constructor. + +When this event emits, call the `read()` method to consume the data. + +### Event: 'end' Emitted when the stream has received an EOF (FIN in TCP terminology). Indicates that no more `'data'` events will happen. If the stream is also writable, it may be possible to continue writing. -### Event: 'error' +### Event: 'data' + +The `'data'` event emits either a `Buffer` (by default) or a string if +`setEncoding()` was used. + +Note that adding a `'data'` event listener will switch the Readable +stream into "old mode", where data is emitted as soon as it is +available, rather than waiting for you to call `read()` to consume it. -`function (exception) { }` +### Event: 'error' Emitted if there was an error receiving data. ### Event: 'close' -`function () { }` - Emitted when the underlying resource (for example, the backing file descriptor) has been closed. Not all streams will emit this. -### stream.readable - -A boolean that is `true` by default, but turns `false` after an -`'error'` occurred, the stream came to an `'end'`, or `destroy()` was -called. - -### stream.setEncoding([encoding]) +### readable.setEncoding(encoding) Makes the `'data'` event emit a string instead of a `Buffer`. `encoding` -can be `'utf8'`, `'utf16le'` (`'ucs2'`), `'ascii'`, or `'hex'`. Defaults -to `'utf8'`. - -### stream.pause() +can be `'utf8'`, `'utf16le'` (`'ucs2'`), `'ascii'`, or `'hex'`. -Issues an advisory signal to the underlying communication layer, -requesting that no further data be sent until `resume()` is called. +The encoding can also be set by specifying an `encoding` field to the +constructor. 
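As a rough illustration of how `setEncoding()` and the `'readable'` event fit together, the sketch below uses a `PassThrough` stream as a stand-in data source. The helper name `readAvailable` is hypothetical and not part of the stream API; it simply calls `read()` until `null` is returned.

```javascript
var PassThrough = require('stream').PassThrough;

// Hypothetical helper: consume whatever is currently buffered.
// read() returns null once the internal buffer is empty.
function readAvailable(stream) {
  var chunks = [];
  var chunk;
  while ((chunk = stream.read()) !== null) {
    chunks.push(chunk);
  }
  return chunks;
}

// A PassThrough stands in for a real source such as a socket or file.
var source = new PassThrough();

// With an encoding set, read() hands back strings instead of Buffers.
source.setEncoding('utf8');

source.on('readable', function() {
  readAvailable(source).forEach(function(text) {
    console.log('got a %s: %j', typeof text, text);
  });
});

source.on('end', function() {
  console.log('no more data');
});

source.end('some text');
```

Because the handler consumes data only by calling `read()`, no `'data'` listener is attached and the stream is not switched into "old mode".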
-Note that, due to the advisory nature, certain streams will not be -paused immediately, and so `'data'` events may be emitted for some -indeterminate period of time even after `pause()` is called. You may -wish to buffer such `'data'` events. +### readable.read([size]) -### stream.resume() +* `size` {Number | null} Optional number of bytes to read. +* Return: {Buffer | String | null} -Resumes the incoming `'data'` events after a `pause()`. +Call this method to consume data once the `'readable'` event is +emitted. -### stream.destroy() +The `size` argument will set a minimum number of bytes that you are +interested in. If not set, then the entire content of the internal +buffer is returned. -Closes the underlying file descriptor. Stream is no longer `writable` -nor `readable`. The stream will not emit any more 'data', or 'end' -events. Any queued write data will not be sent. The stream should emit -'close' event once its resources have been disposed of. +If there is no data to consume, or if there are fewer bytes in the +internal buffer than the `size` argument, then `null` is returned, and +a future `'readable'` event will be emitted when more is available. +Note that calling `stream.read(0)` will always return `null`, and will +trigger a refresh of the internal buffer, but otherwise be a no-op. -### stream.pipe(destination, [options]) +### readable.pipe(destination, [options]) -This is a `Stream.prototype` method available on all `Stream`s. +* `destination` {Writable Stream} +* `options` {Object} Optional + * `end` {Boolean} Default=true -Connects this read stream to `destination` WriteStream. Incoming data on -this stream gets written to `destination`. The destination and source -streams are kept in sync by pausing and resuming as necessary. +Connects this readable stream to `destination` WriteStream. Incoming +data on this stream gets written to `destination`. 
Properly manages +back-pressure so that a slow destination will not be overwhelmed by a +fast readable stream. This function returns the `destination` stream. -Emulating the Unix `cat` command: - - process.stdin.resume(); process.stdin.pipe(process.stdout); +For example, emulating the Unix `cat` command: + process.stdin.pipe(process.stdout); By default `end()` is called on the destination when the source stream emits `end`, so that `destination` is no longer writable. Pass `{ end: false }` as `options` to keep the destination stream open. -This keeps `process.stdout` open so that "Goodbye" can be written at the +This keeps `writer` open so that "Goodbye" can be written at the end. - process.stdin.resume(); + reader.pipe(writer, { end: false }); + reader.on("end", function() { + writer.end("Goodbye\n"); + }); + +Note that `process.stderr` and `process.stdout` are never closed until +the process exits, regardless of the specified options. + +### readable.unpipe([destination]) + +* `destination` {Writable Stream} Optional + +Undo a previously established `pipe()`. If no destination is +provided, then all previously established pipes are removed. + +### readable.pause() - process.stdin.pipe(process.stdout, { end: false }); +Switches the readable stream into "old mode", where data is emitted +using a `'data'` event rather than being buffered for consumption via +the `read()` method. - process.stdin.on("end", function() { - process.stdout.write("Goodbye\n"); }); +Ceases the flow of data. No `'data'` events are emitted while the +stream is in a paused state. +### readable.resume() -## Writable Stream +Switches the readable stream into "old mode", where data is emitted +using a `'data'` event rather than being buffered for consumption via +the `read()` method. + +Resumes the incoming `'data'` events after a `pause()`. + + +## Class: stream.Writable -A `Writable Stream` has the following methods, members, and events. 
+A `Writable` Stream has the following methods, members, and events. -### Event: 'drain' +Note that `stream.Writable` is an abstract class designed to be +extended with an underlying implementation of the `_write(chunk, cb)` +method. (See below.) -`function () { }` +### new stream.Writable([options]) -Emitted when the stream's write queue empties and it's safe to write without -buffering again. Listen for it when `stream.write()` returns `false`. +* `options` {Object} + * `highWaterMark` {Number} Buffer level when `write()` starts + returning false. Default=16kb + * `lowWaterMark` {Number} The buffer level when `'drain'` is + emitted. Default=0 + * `decodeStrings` {Boolean} Whether or not to decode strings into + Buffers before passing them to `_write()`. Default=true -The `'drain'` event can happen at *any* time, regardless of whether or not -`stream.write()` has previously returned `false`. To avoid receiving unwanted -`'drain'` events, listen using `stream.once()`. +In classes that extend the Writable class, make sure to call the +constructor so that the buffering settings can be properly +initialized. -### Event: 'error' +### writable.\_write(chunk, callback) -`function (exception) { }` +* `chunk` {Buffer | Array} The data to be written +* `callback` {Function} Called with an error, or null when finished -Emitted on error with the exception `exception`. +All Writable stream implementations must provide a `_write` method to +send data to the underlying resource. -### Event: 'close' +**This function MUST NOT be called directly.** It should be +implemented by child classes, and called by the internal Writable +class methods only. + +Call the callback using the standard `callback(error)` pattern to +signal that the write completed successfully or with an error. + +If the `decodeStrings` flag is set in the constructor options, then +`chunk` will be an array rather than a Buffer. 
This is to support +implementations that have an optimized handling for certain string +data encodings. + +This method is prefixed with an underscore because it is internal to +the class that defines it, and should not be called directly by user +programs. However, you **are** expected to override this method in +your own extension classes. + + +### writable.write(chunk, [encoding], [callback]) + +* `chunk` {Buffer | String} Data to be written +* `encoding` {String} Optional. If `chunk` is a string, then encoding + defaults to `'utf8'` +* `callback` {Function} Optional. Called when this chunk is + successfully written. +* Returns {Boolean} + +Writes `chunk` to the stream. Returns `true` if the data has been +flushed to the underlying resource. Returns `false` to indicate that +the buffer is full, and the data will be sent out in the future. The +`'drain'` event will indicate when the buffer is empty again. + +The specifics of when `write()` will return false, and when a +subsequent `'drain'` event will be emitted, are determined by the +`highWaterMark` and `lowWaterMark` options provided to the +constructor. + +### writable.end([chunk], [encoding]) + +* `chunk` {Buffer | String} Optional final data to be written +* `encoding` {String} Optional. If `chunk` is a string, then encoding + defaults to `'utf8'` + +Call this method to signal the end of the data being written to the +stream. -`function () { }` +### Event: 'drain' + +Emitted when the stream's write queue empties and it's safe to write +without buffering again. Listen for it when `stream.write()` returns +`false`. + +### Event: 'close' -Emitted when the underlying file descriptor has been closed. +Emitted when the underlying resource (for example, the backing file +descriptor) has been closed. Not all streams will emit this. ### Event: 'pipe' -`function (src) { }` +* `source` {Readable Stream} Emitted when the stream is passed to a readable stream's pipe method. 
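A minimal sketch of observing the `'pipe'` event, assuming two `PassThrough` streams stand in for a real source and destination. The `sources` array is only there for illustration.

```javascript
var PassThrough = require('stream').PassThrough;

var reader = new PassThrough();
var writer = new PassThrough();

// Illustration only: remember every stream that gets piped into writer.
var sources = [];

writer.on('pipe', function(source) {
  // source is the readable stream whose pipe() method was called.
  sources.push(source);
  console.log('a readable stream was piped into writer');
});

// Triggers the 'pipe' event on writer.
reader.pipe(writer);
```

A destination might use this event to keep track of how many sources are currently feeding it.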
-### stream.writable +### Event: 'unpipe' + +* `source` {Readable Stream} + +Emitted when a previously established `pipe()` is removed using the +source Readable stream's `unpipe()` method. + +## Class: stream.Duplex + + + +A "duplex" stream is one that is both Readable and Writable, such as a +TCP socket connection. + +Note that `stream.Duplex` is an abstract class designed to be +extended with an underlying implementation of the `_read(size, cb)` +and `_write(chunk, callback)` methods as you would with a Readable or +Writable stream class. + +Since JavaScript doesn't have multiple prototypal inheritance, this +class prototypally inherits from Readable, and then parasitically from +Writable. It is thus up to the user to implement both the low-level +`_read(n,cb)` method as well as the low-level `_write(chunk,cb)` method +on extension duplex classes. + +### new stream.Duplex(options) + +* `options` {Object} Passed to both Writable and Readable + constructors. Also has the following fields: + * `allowHalfOpen` {Boolean} Default=true. If set to `false`, then + the stream will automatically end the readable side when the + writable side ends and vice versa. + +In classes that extend the Duplex class, make sure to call the +constructor so that the buffering settings can be properly +initialized. + +## Class: stream.Transform + +A "transform" stream is a duplex stream where the output is causally +connected in some way to the input, such as a zlib stream or a crypto +stream. + +There is no requirement that the output be the same size as the input, +the same number of chunks, or arrive at the same time. For example, a +Hash stream will only ever have a single chunk of output which is +provided when the input is ended. A zlib stream will produce output +that is either much smaller or much larger than its input. + +Rather than implement the `_read()` and `_write()` methods, Transform +classes must implement the `_transform()` method, and may optionally +also implement the `_flush()` method. 
(See below.) + +### new stream.Transform([options]) + +* `options` {Object} Passed to both Writable and Readable + constructors. + +In classes that extend the Transform class, make sure to call the +constructor so that the buffering settings can be properly +initialized. + +### transform.\_transform(chunk, outputFn, callback) + +* `chunk` {Buffer} The chunk to be transformed. +* `outputFn` {Function} Call this function with any output data to be + passed to the readable interface. +* `callback` {Function} Call this function (optionally with an error + argument) when you are done processing the supplied chunk. -A boolean that is `true` by default, but turns `false` after an -`'error'` occurred or `end()` / `destroy()` was called. +All Transform stream implementations must provide a `_transform` +method to accept input and produce output. -### stream.write(string, [encoding]) +**This function MUST NOT be called directly.** It should be +implemented by child classes, and called by the internal Transform +class methods only. -Writes `string` with the given `encoding` to the stream. Returns `true` -if the string has been flushed to the kernel buffer. Returns `false` to -indicate that the kernel buffer is full, and the data will be sent out -in the future. The `'drain'` event will indicate when the kernel buffer -is empty again. The `encoding` defaults to `'utf8'`. +`_transform` should do whatever has to be done in this specific +Transform class, to handle the bytes being written, and pass them off +to the readable portion of the interface. Do asynchronous I/O, +process things, and so on. -### stream.write(buffer) +Call the callback function only when the current chunk is completely +consumed. Note that this may mean that you call the `outputFn` zero +or more times, depending on how much data you want to output as a +result of this chunk. -Same as the above except with a raw buffer. 
+This method is prefixed with an underscore because it is internal to +the class that defines it, and should not be called directly by user +programs. However, you **are** expected to override this method in +your own extension classes. -### stream.end() +### transform.\_flush(outputFn, callback) -Terminates the stream with EOF or FIN. This call will allow queued -write data to be sent before closing the stream. +* `outputFn` {Function} Call this function with any output data to be + passed to the readable interface. +* `callback` {Function} Call this function (optionally with an error + argument) when you are done flushing any remaining data. -### stream.end(string, encoding) +**This function MUST NOT be called directly.** It MAY be implemented +by child classes, and if so, will be called by the internal Transform +class methods only. -Sends `string` with the given `encoding` and terminates the stream with -EOF or FIN. This is useful to reduce the number of packets sent. +In some cases, your transform operation may need to emit a bit more +data at the end of the stream. For example, a `Zlib` compression +stream will store up some internal state so that it can optimally +compress the output. At the end, however, it needs to do the best it +can with what is left, so that the data will be complete. -### stream.end(buffer) +In those cases, you can implement a `_flush` method, which will be +called at the very end, after all the written data is consumed, but +before emitting `end` to signal the end of the readable side. Just +like with `_transform`, call `outputFn` zero or more times, as +appropriate, and call `callback` when the flush operation is complete. -Same as above but with a `buffer`. +This method is prefixed with an underscore because it is internal to +the class that defines it, and should not be called directly by user +programs. However, you **are** expected to override this method in +your own extension classes. 
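The `_transform` and `_flush` contracts described above can be sketched with a small line splitter. This is an illustration under stated assumptions, not a working stream: `LineSplitter` is a plain object that does not extend `stream.Transform`, and `runTransform` is a hypothetical driver that stands in for the internal Transform machinery so that the two methods can be exercised in isolation. In real use the class would extend `stream.Transform`, call its constructor, and never have these methods called by user code. The driver also assumes the callbacks are invoked synchronously.

```javascript
// Splits incoming text into lines, written against the
// (chunk, outputFn, callback) contract described above.
function LineSplitter() {
  this._remainder = '';
}

LineSplitter.prototype._transform = function(chunk, outputFn, callback) {
  var lines = (this._remainder + chunk.toString()).split('\n');
  // The last piece may be an incomplete line; hold it for the next chunk.
  this._remainder = lines.pop();
  // outputFn may be called zero or more times per chunk.
  lines.forEach(function(line) {
    outputFn(line);
  });
  // Signal that this chunk is completely consumed.
  callback();
};

LineSplitter.prototype._flush = function(outputFn, callback) {
  // Emit whatever is left over once all input has been written.
  if (this._remainder.length > 0) {
    outputFn(this._remainder);
    this._remainder = '';
  }
  callback();
};

// Hypothetical driver, for illustration only. It plays the role of the
// internal class methods: one _transform call per chunk, then _flush.
function runTransform(transform, chunks) {
  var output = [];
  function outputFn(data) { output.push(data); }
  function done(er) { if (er) throw er; }
  chunks.forEach(function(chunk) {
    transform._transform(chunk, outputFn, done);
  });
  transform._flush(outputFn, done);
  return output;
}

var lines = runTransform(new LineSplitter(), ['one\ntw', 'o\nthree']);
console.log(lines);
```

Note that the first chunk produces one line of output and the trailing `three` only appears because `_flush` emits the held remainder.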
-### stream.destroy() -Closes the underlying file descriptor. Stream is no longer `writable` -nor `readable`. The stream will not emit any more 'data', or 'end' -events. Any queued write data will not be sent. The stream should emit -'close' event once its resources have been disposed of. +## Class: stream.PassThrough -### stream.destroySoon() +This is a trivial implementation of a `Transform` stream that simply +passes the input bytes across to the output. Its purpose is mainly +for examples and testing, but there are occasionally use cases where +it can come in handy. -After the write queue is drained, close the file descriptor. -`destroySoon()` can still destroy straight away, as long as there is no -data left in the queue for writes. [EventEmitter]: events.html#events_class_events_eventemitter -- 2.7.4