--- /dev/null
+Frame step
+----------
+
+This document is a draft that lists some ideas and is purely informational.
+
+This document outlines the details of the frame stepping functionality in
+GStreamer.
+
+The stepping functionality operates on the current playback segment, position
+and rate as it was configured with a regular seek event. In contrast to the seek
+event, it operates very closely to the sink and thus has a very low latency and
+is not slowed down by queues and does not actually perform any seeking logic.
+For this reason we want to include a new API instead of reusing the seek API.
+
+The following requirements are needed:
+
+ - The ability to walk forwards and backwards in the stream.
+ - Arbitrary increments in any supported format (time, frames, bytes ...)
+ - High speed, minimal overhead. This mechanism is not more expensive than
+ simple playback.
+ - Switching between forwards and backwards stepping should be fast.
+ - Maintain synchronisation between streams.
+ - Get feedback of the amount of skipped data.
+ - Ability to play a certain amount of data at an arbitrary speed.
+
+We want a system where we can step frames in PAUSED as well as play short
+segments of data in PLAYING.
+
+
+Use Cases
+---------
+
+ * frame stepping in video only pipeline in PAUSED
+
+    .-----.    .-------.              .------.    .-------.
+    | src |    | demux |    .-----.   | vdec |    | vsink |
+    |    src->sink    src1->|queue|->sink   src->sink     |
+    '-----'    '-------'    '-----'   '------'    '-------'
+
+ - app sets the pipeline to PAUSED to block on the preroll picture
+ - app seeks to required position in the stream. This can be done with a
+ positive or negative rate depending on the required frame stepping
+ direction.
+ - app steps frames (in GST_FORMAT_DEFAULT or GST_FORMAT_BUFFERS). The
+ pipeline loses its PAUSED state until the required number of frames has
+ been skipped; it then prerolls again. This skipping is done purely in
+ the sink.
+ - sink posts STEP_DONE with amount of frames stepped and corresponding time
+ interval.
+
+ * frame stepping in audio/video pipeline in PAUSED
+
+    .-----.    .-------.              .------.    .-------.
+    | src |    | demux |    .-----.   | vdec |    | vsink |
+    |    src->sink    src1->|queue|->sink   src->sink     |
+    '-----'    |       |    '-----'   '------'    '-------'
+               |       |              .------.    .-------.
+               |       |    .-----.   | adec |    | asink |
+               |      src2->|queue|->sink   src->sink     |
+               '-------'    '-----'   '------'    '-------'
+
+
+ - app sets the pipeline to PAUSED to block on the preroll picture
+ - app seeks to required position in the stream. This can be done with a
+ positive or negative rate depending on the required frame stepping
+ direction.
+ - app steps frames (in GST_FORMAT_DEFAULT or GST_FORMAT_BUFFERS) or an amount
+ of time on the video sink. The pipeline loses its PAUSED state until the
+ required number of frames has been skipped; it then prerolls again.
+ This skipping is done purely in the sink.
+ - sink posts STEP_DONE with amount of frames stepped and corresponding time
+ interval.
+ - the app skips the same amount of time on the audiosink to align the
+ streams again. When a large number of video frames is skipped, there needs
+ to be enough queueing in the pipeline to compensate for the accumulated
+ audio.
+
+ * frame stepping in audio/video pipeline in PLAYING
+
+ - app sets the pipeline to PAUSED to block on the preroll picture
+ - app seeks to required position in the stream. This can be done with a
+ positive or negative rate depending on the required frame stepping
+ direction.
+ - app configures frame steps (in GST_FORMAT_DEFAULT or GST_FORMAT_BUFFERS) or
+ an amount of time on the sink. The step event has a flag indicating live
+ stepping so that the stepping will only happen in PLAYING.
+ - app sets pipeline to PLAYING. The pipeline continues PLAYING until it
+ has consumed the requested amount of data.
+ - sink posts STEP_DONE with amount of frames stepped and corresponding time
+ interval. The sink will then wait for another step event. Since the
+ STEP_DONE message was emitted by the sink when it handed off the buffer to
+ the device, there is usually sufficient time to queue a new STEP event so
+ that one can seamlessly continue stepping.
+
+
+Events
+------
+
+ A new GST_EVENT_STEP event is introduced to start the step operation.
+ The step event is created with the following fields in the structure:
+
+ "format", GST_TYPE_FORMAT
+ The format of the step units
+
+ "amount", G_TYPE_UINT64
+ The amount of units to step. -1 resumes normal non-stepping behaviour to
+ the end of the segment.
+
+ "rate", G_TYPE_DOUBLE
+ The rate and direction at which the frames should be stepped in PLAYING
+ mode. 1.0 is the normal playback speed and direction of the segment, 2.0
+ is double speed. A negative rate will step backwards. A speed of 0.0 is
+ not allowed. When performing a flushing step, the speed is not relevant.
+
+ "flush", G_TYPE_BOOLEAN
+ When flushing is TRUE, the step is performed immediately:
+
+ - In the PAUSED state the pipeline loses the PAUSED state, the requested
+ amount of data is skipped and the pipeline prerolls again. If the
+ pipeline was already stepping when the event is sent, the current step
+ operation is updated with the new amount and format. The sink will make
+ a best effort to comply with the new amount.
+ - In the PLAYING state, the requested amount of data is skipped (not
+ rendered) from the previous STEP request or from the position of the
+ last PAUSED if no previous STEP operation was performed.
+
+ When flushing is FALSE, the step will be performed later.
+
+ - In the PAUSED state the step will be done when going to PLAYING. Any
+ previous step operation will be overridden with the new STEP event.
+ - In the PLAYING state the step operation will be performed after the
+ current step operation completes. If there was no previous step
+ operation, the step operation will be performed from the position of the
+ last PAUSED state.
+
+ The application will create a STEP event to start or stop the stepping
+ operation. Both stepping in PAUSED and PLAYING can be performed by means of
+ the flush flag.
+
+ The event is usually sent to the pipeline, which will typically distribute the
+ event to all of its sinks. For some use cases, like frame stepping on video
+ frames only, the event should only be sent to the video sink and upon reception
+ of the STEP_DONE message, one can step the other sinks to align the streams
+ again.
+
+ Since the step event does not update the base_time of any of the elements, the
+ sinks should keep track of the amount of stepped data in order to remain
+ synchronized against the clock.
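
The field list and the flush rules above can be modelled in a few lines of
plain C. This is only a sketch: the type and function names are hypothetical
stand-ins rather than GStreamer API, and representing the "-1" amount as
UINT64_MAX is an assumption based on the field being an unsigned 64-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not GStreamer types. */
typedef enum { FORMAT_DEFAULT, FORMAT_BUFFERS, FORMAT_TIME, FORMAT_BYTES } StepFormat;
typedef enum { STATE_PAUSED, STATE_PLAYING } PipelineState;
typedef enum { STEP_NOW, STEP_ON_PLAYING, STEP_AFTER_CURRENT } StepTiming;

/* Assumed encoding of the "-1" amount in an unsigned 64-bit field. */
#define STEP_AMOUNT_RESUME UINT64_MAX

typedef struct {
  StepFormat format;  /* "format": units of the step */
  uint64_t   amount;  /* "amount": number of units to step */
  double     rate;    /* "rate": speed and direction, 0.0 is not allowed */
  bool       flush;   /* "flush": perform the step immediately or later */
} StepEvent;

/* A rate of 0.0 is not allowed. */
static bool step_event_is_valid(const StepEvent *ev) {
  return ev->rate != 0.0;
}

/* When the step is performed, following the flush rules described above. */
static StepTiming step_event_timing(const StepEvent *ev, PipelineState state) {
  if (ev->flush)
    return STEP_NOW;              /* flushing: performed immediately */
  if (state == STATE_PAUSED)
    return STEP_ON_PLAYING;       /* deferred until the pipeline goes to PLAYING */
  return STEP_AFTER_CURRENT;      /* queued behind the current step operation */
}
```

A non-flushing event sent in PAUSED therefore only takes effect once the
application sets the pipeline to PLAYING, which matches the PLAYING use case.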
+
+
+Messages
+--------
+
+ A new GST_MESSAGE_STEP_DONE message is created. It contains the following
+ fields:
+
+ "format", GST_TYPE_FORMAT
+ The format of the step units that completed.
+
+ "amount", G_TYPE_UINT64
+ The amount of units that were stepped.
+
+ "rate", G_TYPE_DOUBLE
+ The rate and direction at which the frames were stepped.
+
+ "duration", G_TYPE_UINT64
+ The total duration of the stepped units in GST_FORMAT_TIME.
+
+ The message is emitted by the element that performs the step operation.
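
As an illustration of how an application might use the "duration" field to
realign the audio sink after stepping video frames, the sketch below derives a
follow-up step from a completed one. The structs and make_align_step() are
hypothetical models, not GStreamer API, and sending the follow-up step as a
flushing step is an assumption that fits the PAUSED use case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not GStreamer types. */
typedef enum { FORMAT_BUFFERS, FORMAT_TIME } StepFormat;

typedef struct {
  StepFormat format;    /* format of the units that were stepped */
  uint64_t   amount;    /* number of units that were stepped */
  double     rate;      /* rate and direction of the step */
  uint64_t   duration;  /* total duration of the stepped units, in time units */
} StepDone;

typedef struct {
  StepFormat format;
  uint64_t   amount;
  double     rate;
  bool       flush;
} StepEvent;

/* Build the step for the other sink: skip the same amount of time. */
static StepEvent make_align_step(const StepDone *done) {
  StepEvent ev;
  ev.format = FORMAT_TIME;      /* align in time, whatever the video format was */
  ev.amount = done->duration;   /* same interval the video sink reported */
  ev.rate = done->rate;
  ev.flush = true;              /* assumption: realign immediately */
  return ev;
}
```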
+
+
+Direction switch
+----------------
+
+ When quickly switching between a forwards and a backwards step of, for example,
+ one video frame, we need to either:
+
+ a) issue a new seek to change the direction from the current position.
+ b) cache a certain number of stepped frames and walk the cache.
+
+ Option a) might be very slow.
+ For option b) we would ideally like to offload this caching functionality to a
+ separate element, which means that we need to forward the STEP event upstream.
+ It's unclear how this could work in a generic way. What is a demuxer supposed
+ to do when it receives a step event? A flushing seek to what stream position?
+
+
------------------
Transform elements transform input buffers to output buffers based
-on the sink and source caps.
+on the sink and source caps.
-typical transform elements include:
+An important requirement for a transform is that the output caps are completely
+defined by the input caps and vice versa. This means that a typical decoder
+element can NOT be implemented with a transform element, because output caps
+such as the width and height of the decompressed video frame are encoded in
+the stream and thus not defined by the input caps.
- - audio convertors (audioconvert, ...)
- - video convertors (colorspace, videoscale, audioconvert, ...)
- - filters (capfilter, colorbalance,
+Typical transform elements include:
+
+ - audio convertors (audioconvert, audioresample, ...)
+ - video convertors (colorspace, videoscale, ...)
+ - filters (capsfilter, volume, colorbalance, ...)
The implementation of the transform element has to take care of
the following things:
Some transform elements can operate in different modes:
- - passthrough (no changes to buffers)
- - in-place (changes made to incomming buffer)
+ - passthrough (no changes are done on the input buffers)
+ - in-place (changes made directly to the incoming buffers without requiring a
+ copy or new buffer allocation)
- metadata changes only
Depending on the mode of operation the buffer allocation strategy might change.
+The transform element should at any point be able to renegotiate sink and src
+caps as well as change the operation mode.
+
+In addition, the transform element will typically take care of the following
+things as well:
+
+ - flushing, seeking
+ - state changes
+ - timestamping, this is typically done by copying the input timestamps to the
+ output buffers but subclasses should be able to override this.
+ - QoS, avoiding calls to the subclass transform function
+ - handle scheduling issues such as push and pull based operation.
+
+In the next sections, we will describe the behaviour of the transform element in
+each of the above use cases. We focus mostly on the buffer allocation strategies
+and caps negotiation.
+
+Processing
+----------
+
+A transform has 2 main processing functions:
+
+ - transform():
+
+ Transform the input buffer to the output buffer. The output buffer is
+ guaranteed to be writable and different from the input buffer.
+
+ - transform_ip():
+
+ Transform the input buffer in-place. The input buffer is writable and at
+ least as large as the output buffer.
+
+A transform can operate in the following modes:
+
+ - passthrough:
+
+ The element will not make changes to the buffers; buffers are pushed straight
+ through, and caps on both sides need to be the same. The element can
+ optionally implement a transform_ip() function to take a look at the data;
+ the buffer does not have to be writable.
+
+ - in-place:
+
+ Changes can be made to the input buffer directly to obtain the output buffer.
+ The transform must implement a transform_ip() function.
+
+ - copy-transform:
+
+ The transform is performed by copying and transforming the input buffer to a
+ new output buffer. The transform must implement a transform() function.
+
+When no transform() function is provided, only in-place and passthrough
+operation is allowed. This means that source and destination caps must be equal
+or that the source buffer size is greater than or equal to the destination
+buffer size.
+
+When no transform_ip() function is provided, only passthrough and
+copy-transforms are supported. Providing this function is an optimisation that
+can avoid a buffer copy.
+
+When no functions are provided, we can only process in passthrough mode.
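
The rules above about which processing functions enable which modes can be
written down as a small predicate. This is a sketch with hypothetical names
(TransformClass, mode_is_supported(), in_place_sizes_ok()); it models only the
rules stated in this section and not the actual base class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration. */
typedef enum { MODE_PASSTHROUGH, MODE_IN_PLACE, MODE_COPY } TransformMode;

typedef struct {
  bool has_transform;     /* subclass provides transform() */
  bool has_transform_ip;  /* subclass provides transform_ip() */
} TransformClass;

/* Which modes are available, given the functions the subclass implements. */
static bool mode_is_supported(const TransformClass *klass, TransformMode mode) {
  switch (mode) {
    case MODE_PASSTHROUGH:
      return true;                     /* needs no processing function */
    case MODE_IN_PLACE:
      return klass->has_transform_ip;  /* must implement transform_ip() */
    case MODE_COPY:
      return klass->has_transform;     /* must implement transform() */
  }
  return false;
}

/* Without transform(), caps must be equal or the input at least as large. */
static bool in_place_sizes_ok(bool same_caps, size_t in_size, size_t out_size) {
  return same_caps || in_size >= out_size;
}
```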
+
Negotiation
-----------
-The transform element is configured to perform a specific transform in these
-two situations:
+ Typical (re)negotiation of the transform element in push mode always goes from
+ sink to src. This triggers the following sequence:
+
+ - the sinkpad receives a buffer with new caps, this triggers the setcaps
+ function on the sinkpad before handing the buffer to transform.
+ - the transform function figures out what it can convert these caps to.
+ - try to see if we can configure the caps unmodified on the peer. We need to
+ do this because we prefer to not do anything.
+ - the transform configures itself to transform from the new sink caps to the
+ target src caps
+ - the transform processes and sets the output caps on the src pad
+
+ We call this downstream negotiation (DN) and it goes roughly like this:
+
+            sinkpad             transform                srcpad
+  setcaps()    |                    |                      |
+  ------------>|  find_transform()  |                      |
+               |------------------->|                      |
+               |                    |       setcaps()      |
+               |                    |--------------------->|
+               | <configure caps> <-|                      |
+
+
+ These steps configure the element for a transformation from the input caps to
+ the output caps.
+
+ The transform has 3 functions to perform the negotiation:
+
+ - transform_caps():
+
+ Transform the caps on a certain pad to all the possible supported caps on
+ the other pad. The input caps are guaranteed to be a simple caps with just
+ one structure. The caps do not have to be fixed.
+
+ - fixate_caps():
+
+ Given a caps on one pad, fixate the caps on the other pad. The target caps
+ are writable.
+
+ - set_caps():
+
+ Configure the transform for a transformation between src caps and dest
+ caps. Both caps are guaranteed to be fixed caps.
+
+ If no transform_caps() is defined, we can only perform the identity transform,
+ by default.
+
+ If no set_caps() is defined, we don't care about caps. In that case we also
+ assume nothing is going to write to the buffer and we don't enforce a writable
+ buffer for the transform_ip function, when present.
+
+ One common function that we need for the transform element is to find the best
+ transform from one format (src) to another (dest). Since the function is
+ bidirectional, we will use the src->dest negotiation. Some requirements of this
+ function are:
+
+ - has a fixed src caps
+ - finds a fixed dest caps that the transform element can transform to
+ - the dest caps are compatible and can be accepted by peer elements
+ - the transform function prefers to make src caps == dest caps
+ - the transform function can optionally fixate dest caps.
+
+ The find_transform() function goes like this:
+
+ - start from src caps, these caps are fixed.
+ - check if the caps are acceptable for us as src caps. This is usually
+ enforced by the padtemplate of the element.
+ - check if the caps are acceptable for the peer. If this is possible, we can
+ perform passthrough and make src == dest. This is performed by simply
+ calling gst_pad_peer_accept_caps().
+ - if the caps are not acceptable, we need to transform to something,
+ for each of the transformed caps retrieved with transform_caps():
+ - try to fixate the caps with fixate_caps()
+ - if the caps are fixated, check if the peer accepts them with
+ _peer_accept_caps(), if the peer accepts, we have found a dest caps.
+ - if we run out of caps, we fail to find a transform.
+ - if we found a destination caps, configure the transform with set_caps().
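
The find_transform() steps listed above can be sketched as follows. Caps are
modelled as plain integer ids and the element and its peer as function
pointers; all of these names are hypothetical, and peer_accepts merely stands
in for gst_pad_peer_accept_caps(). The demo_* functions describe an invented
element used only to exercise the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CAPS_NONE (-1)
typedef int Caps;  /* hypothetical: an id standing for one fixed caps */

typedef struct {
  bool   (*accepts_src)(Caps src);                          /* pad template check */
  size_t (*transform_caps)(Caps src, Caps *out, size_t max);
  Caps   (*fixate_caps)(Caps src, Caps candidate);          /* CAPS_NONE on failure */
  bool   (*set_caps)(Caps src, Caps dest);
  bool   (*peer_accepts)(Caps caps);
} Transform;

/* Returns the negotiated dest caps, or CAPS_NONE when no transform is found. */
static Caps find_transform(const Transform *t, Caps src) {
  Caps candidates[8];
  Caps dest = CAPS_NONE;
  size_t n, i;

  if (!t->accepts_src(src))
    return CAPS_NONE;                 /* not acceptable as src caps */

  if (t->peer_accepts(src)) {
    dest = src;                       /* passthrough: src == dest */
  } else {
    n = t->transform_caps(src, candidates, 8);
    for (i = 0; i < n && dest == CAPS_NONE; i++) {
      Caps fixed = t->fixate_caps(src, candidates[i]);
      if (fixed != CAPS_NONE && t->peer_accepts(fixed))
        dest = fixed;                 /* first fixated caps the peer accepts */
    }
  }
  if (dest == CAPS_NONE)
    return CAPS_NONE;                 /* ran out of candidate caps */
  return t->set_caps(src, dest) ? dest : CAPS_NONE;
}

/* Invented demo element: takes caps 1 or 2 and can produce caps 1, 2 or 3. */
static Caps demo_peer_only = 1;       /* the single caps id the demo peer accepts */

static bool demo_accepts_src(Caps src) { return src == 1 || src == 2; }
static size_t demo_transform_caps(Caps src, Caps *out, size_t max) {
  static const Caps all[] = { 1, 2, 3 };
  size_t n = 0, i;
  (void) src;
  for (i = 0; i < 3 && n < max; i++)
    out[n++] = all[i];
  return n;
}
static Caps demo_fixate_caps(Caps src, Caps candidate) { (void) src; return candidate; }
static bool demo_set_caps(Caps src, Caps dest) { (void) src; (void) dest; return true; }
static bool demo_peer_accepts(Caps caps) { return caps == demo_peer_only; }

static const Transform demo = {
  demo_accepts_src, demo_transform_caps, demo_fixate_caps,
  demo_set_caps, demo_peer_accepts
};
```

Checking the peer with the unmodified caps first is what gives passthrough its
priority over any real conversion.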
+
+ After this negotiation process, the transform element is usually in a steady
+ state. We can identify these steady states:
+
+ - src and sink pads both have the same caps. Note that when the caps are equal
+ on both pads, the input and output buffers automatically have the same size.
+ The element can operate on the buffers in the following ways: (Same caps, SC)
+
+ - passthrough: buffers are inspected but no metadata or buffer data
+ is changed. The input buffers don't need to be writable. The input
+ buffer is simply pushed out again without modifications. (SCP)
+
+            sinkpad             transform                srcpad
+  chain()      |                    |                      |
+  ------------>|  handle_buffer()   |                      |
+               |------------------->|      pad_push()      |
+               |                    |--------------------->|
+               |                    |                      |
+
+ - in-place: buffers are modified in-place, this means that the input
+ buffer is modified to produce a new output buffer. This requires the
+ input buffer to be writable. If the input buffer is not writable, a new
+ buffer has to be allocated with pad-alloc. (SCI)
+
+            sinkpad             transform                srcpad
+  chain()      |                    |                      |
+  ------------>|  handle_buffer()   |                      |
+               |------------------->|                      |
+               |                    |     [!writable]      |
+               |                    |     pad-alloc()      |
+               |                    |--------------------->|
+               |  [caps-changed]  .-|    [caps-changed]    |
+               |  <reconfigure>   | |      setcaps()       |
+               |                  '>|--------------------->|
+               |                  .-|                      |
+               |  <transform_ip>  | |                      |
+               |                  '>|                      |
+               |                    |      pad_push()      |
+               |                    |--------------------->|
+               |                    |                      |
+
+ - copy transform: a new output buffer is allocated with pad-alloc and data
+ from the input buffer is transformed into the output buffer. (SCC)
+
+            sinkpad             transform                srcpad
+  chain()      |                    |                      |
+  ------------>|  handle_buffer()   |                      |
+               |------------------->|                      |
+               |                    |     pad_alloc()      |
+               |                    |--------------------->|
+               |  [caps-changed]  .-|    [caps-changed]    |
+               |  <reconfigure>   | |      setcaps()       |
+               |                  '>|--------------------->|
+               |                  .-|                      |
+               |  <transform>     | |                      |
+               |                  '>|                      |
+               |                    |      pad_push()      |
+               |                    |--------------------->|
+               |                    |                      |
+
+ - src and sink pads have different caps. The element can operate on the
+ buffers in the following way: (Different Caps, DC)
+
+ - in-place: input buffers are modified in-place. This means that the input
+ buffer has a size that is larger or equal to the output size. The input
+ buffer will be resized to the size of the output buffer. If the input
+ buffer is not writable or the output size is bigger than the input size,
+ we need to pad-alloc a new buffer. (DCI)
+
+            sinkpad             transform                srcpad
+  chain()      |                    |                      |
+  ------------>|  handle_buffer()   |                      |
+               |------------------->|                      |
+               |                    | [!writable || !size] |
+               |                    |     pad-alloc()      |
+               |                    |--------------------->|
+               |  [caps-changed]  .-|    [caps-changed]    |
+               |  <reconfigure>   | |      setcaps()       |
+               |                  '>|--------------------->|
+               |                  .-|                      |
+               |  <transform_ip>  | |                      |
+               |                  '>|                      |
+               |                    |      pad_push()      |
+               |                    |--------------------->|
+               |                    |                      |
+
+ - copy transform: a new output buffer is allocated and the data from the
+ input buffer is transformed into the output buffer. The flow is exactly
+ the same as the case with the same-caps negotiation. (DCC)
+
+ We can immediately observe that the copy transform states will need to
+ allocate a buffer from a downstream element using pad-alloc. When the transform
+ element is receiving a non-writable buffer in the in-place state, it will also
+ need to perform a pad-alloc. There is no reason why the passthrough state would
+ perform a pad-alloc. This is important because upstream re-negotiation can only
+ happen when the transform uses pad-alloc for all outgoing buffers.
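
The observation above about when a pad-alloc is needed in each steady state can
be summarised in one function. This is a sketch with hypothetical names; it
encodes only the conditions stated in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for illustration. */
typedef enum { MODE_PASSTHROUGH, MODE_IN_PLACE, MODE_COPY } TransformMode;

/* Does producing the output buffer require a pad-alloc downstream? */
static bool needs_pad_alloc(TransformMode mode, bool in_writable,
                            size_t in_size, size_t out_size) {
  switch (mode) {
    case MODE_PASSTHROUGH:
      return false;  /* the input buffer is pushed out unchanged */
    case MODE_COPY:
      return true;   /* always writes into a freshly allocated buffer */
    case MODE_IN_PLACE:
      /* reuse the input unless it is read-only or too small for the output */
      return !in_writable || out_size > in_size;
  }
  return true;
}
```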
+
+ This steady state changes when one of the following actions occur:
+
+ - the sink pad receives new caps, this triggers the downstream
+ renegotiation process, see above for the flow.
+ - the src pad is instructed to produce new caps because of new caps from
+ pad-alloc, this only happens when the transform calls pad-alloc on the
+ srcpad in order to produce a new output buffer.
+ - the transform element wants to renegotiate (because of changed properties,
+ for example). This essentially clears the current steady state and
+ triggers the downstream and upstream renegotiation process.
+
+ Parallel to the downstream negotiation process there is an upstream negotiation
+ process. The handling and proxy of buffer-alloc is the most complex part of the
+ transform element. This upstream negotiation process has 3 cases: (UN)
+
+ - upstream calls the buffer-alloc function of the transform sinkpad and this
+ call is proxied downstream (UNP)
+ - upstream calls the buffer-alloc function of the transform sinkpad, the
+ transform does not proxy the call but returns a buffer itself (UNU)
+ - the transform calls the pad-alloc function downstream to allocate a new
+ output buffer (but not because of a proxied buffer-alloc) (UNA)
+
+ The case where the pad-alloc is called because an output buffer must be
+ generated in the chain function is handled above in the copy-transform and the
+ in-place transform when the input buffer is not writable or the input buffer
+ size is smaller than the output size.
+
+ We are left with the last case (proxy an incoming pad-alloc or not). We have 2
+ possibilities here:
+
+ - pad-alloc is called with the same caps as are currently being handled by
+ the transform on the sinkcaps. Note that this will only be true when the
+ transform element is completely negotiated because of data processing, see
+ above. When the element is not yet negotiated, we proceed with the case
+ where sinkcaps are different from those in the buffer-alloc.
+
+ * If the transform is using copy-transform, we don't need to proxy because
+ we will call pad-alloc when generating an output buffer.
+
+               sinkpad             transform                srcpad
+  buffer_alloc()  |                    |                      |
+  --------------->|                    |                      |
+                  |                    |                      |
+                  |-.  [same caps &&   |                      |
+  return default  | |   copy-trans]    |                      |
+  <---------------|<'                  |                      |
+                  |                    |                      |
+
+ * If the transform is using in-place and insize < outsize, we proxy
+ the pad-alloc with the srccaps. If the caps are unmodified, we proxy
+ the buffer after changing the caps and size.
+
+               sinkpad             transform                srcpad
+  buffer_alloc()  |                    |                      |
+  --------------->|                    |                      |
+                  |  [same caps &&     |                      |
+                  |   in-place]        |                      |
+                  |------------------->|      pad_alloc()     |
+                  |                    |--------------------->|
+                  |  [caps unchanged]  |                      |
+  return          |  adjust_buffer     |                      |
+  <------------------------------------|                      |
+                  |                    |                      |
+                  |                    |                      |
+
+ * If the transform is using in-place and insize < outsize, we proxy
+ the pad-alloc with the srccaps. If the caps are modified find the best
+ transform from these new caps and return a buffer of this size/caps
+ instead.
+
+               sinkpad             transform                srcpad
+  buffer_alloc()  |                    |                      |
+  --------------->|                    |                      |
+                  |  [same caps &&     |                      |
+                  |   in-place]        |      pad-alloc()     |
+                  |------------------------------------------>|
+                  |  [caps changed]  .-|                      |
+                  |  find_transform()| |                      |
+  return          |                  '>|                      |
+  <------------------------------------|                      |
+                  |                    |                      |
+
+ * If the transform is using in-place and insize >= outsize, we cannot proxy
+ the pad-alloc because the resulting buffer would be too small to return
+ anyway.
+
+ * If the transform is using passthrough, we can proxy the pad-alloc to the
+ source pad. If the caps change, find the best transform and return a
+ buffer of those caps and size instead.
+
+               sinkpad             transform                srcpad
+  buffer_alloc()  |                    |                      |
+  --------------->|  [same caps &&     |                      |
+                  |   passthrough]     |      pad-alloc()     |
+                  |------------------------------------------>|
+                  |  [caps changed]  .-|                      |
+                  |  find_transform()| |                      |
+  return          |                  '>|                      |
+  <------------------------------------|                      |
+                  |                    |                      |
+
+ - pad-alloc is called with different caps than are currently being handled by
+ the transform on the sinkcaps. We have to try to negotiate a new
+ configuration for the transform element.
+
+ * we perform the standard way of finding the best transform using
+ find_transform() and we call the pad-alloc function with these caps.
+ If we get different caps from pad-alloc, we find the best format to
+ transform these to and return those caps instead.
- - new caps are received on the sink pad.
- - new caps are received on the source pad when allocating an output buffer and
- we can transform to these caps with the current input buffer.
+               sinkpad             transform                srcpad
+  buffer_alloc()  |                    |                      |
+  --------------->|                    |                      |
+                  |  find_transform()  |                      |
+                  |------------------->|                      |
+                  |                    |      pad-alloc()     |
+                  |                    |--------------------->|
+  return          |  [caps unchanged]  |                      |
+  <------------------------------------|                      |
+                  |                    |                      |
+                  |  [caps changed]  .-|                      |
+                  |  find_transform()| |                      |
+  return          |                  '>|                      |
+  <------------------------------------|                      |
+                  |                    |                      |
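
The buffer-alloc cases described above can be condensed into a single decision
function. This is a sketch with hypothetical names. Returning a default buffer
for the in-place case with insize >= outsize is an assumption, since the text
only states that the call cannot be proxied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration. */
typedef enum { MODE_PASSTHROUGH, MODE_IN_PLACE, MODE_COPY } TransformMode;

typedef enum {
  ALLOC_RETURN_DEFAULT,       /* do not proxy, return a buffer ourselves */
  ALLOC_PROXY,                /* proxy the pad-alloc with the current src caps */
  ALLOC_NEGOTIATE_THEN_PROXY  /* find_transform() first, then pad-alloc */
} AllocAction;

/* What to do when upstream calls buffer-alloc on the transform sinkpad. */
static AllocAction buffer_alloc_action(bool negotiated, bool same_caps,
                                       TransformMode mode,
                                       size_t in_size, size_t out_size) {
  /* Not negotiated yet, or different caps: negotiate a new configuration. */
  if (!negotiated || !same_caps)
    return ALLOC_NEGOTIATE_THEN_PROXY;

  switch (mode) {
    case MODE_COPY:
      return ALLOC_RETURN_DEFAULT;  /* pad-alloc happens when producing output */
    case MODE_PASSTHROUGH:
      return ALLOC_PROXY;
    case MODE_IN_PLACE:
      /* a proxied buffer is only usable when it can hold the input */
      return in_size < out_size ? ALLOC_PROXY : ALLOC_RETURN_DEFAULT;
  }
  return ALLOC_RETURN_DEFAULT;
}
```

In the two proxying cases the caller still has to compare the caps on the
returned buffer and rerun find_transform() when they have changed, as the
diagrams above show.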