* encoded input.
*
* Audio:
- * For encoding, this is the number of "priming" samples added to the
- * beginning of the stream. The decoded output will be delayed by this
- * many samples relative to the input to the encoder. Note that this
- * field is purely informational and does not directly affect the pts
- * output by the encoder, which should always be based on the actual
- * presentation time, including any delay.
+ * For encoding, this is the number of "priming" samples added by the
+ * encoder to the beginning of the stream. The decoded output will be
+ * delayed by this many samples relative to the input to the encoder (or
+ * more, if the decoder adds its own padding).
+ * The timestamps on the output packets are adjusted by the encoder so
+ * that they always refer to the first sample of the data actually
+ * contained in the packet, including any added padding.
+ * E.g. if the timebase is 1/samplerate and the timestamp of the first
+ * input sample is 0, the timestamp of the first output packet will be
+ * -delay.
+ *
* For decoding, this is the number of samples the decoder needs to
* output before the decoder's output is valid. When seeking, you should
* start decoding this many samples prior to your desired seek point.