docs/design/part-negotiation.txt

   1 Negotiation
   2 -----------
   3
   4 Capabilities negotiation is the process of deciding on an adequate
   5 format for dataflow within a GStreamer pipeline. Ideally, negotiation
   6 (also known as "capsnego") transfers information from those parts of the
   7 pipeline that have information to those parts of the pipeline that are
   8 flexible, constrained by those parts of the pipeline that are not
   9 flexible.
  10
  11
  12 Basic rules
  13 ~~~~~~~~~~~
  14
  15 Two simple rules must be followed:
  16
  17  1) downstream suggests formats
  18  2) upstream decides on format
  19
  20 There are 4 queries/events used in caps negotiation:
  21
  22  1) GST_QUERY_CAPS        : get possible formats
  23  2) GST_QUERY_ACCEPT_CAPS : check if format is possible
  24  3) GST_EVENT_CAPS        : configure format (downstream)
  25  4) GST_EVENT_RECONFIGURE : inform upstream of possibly new caps
  26
  27
  28 Queries
  29 ~~~~~~~
  30
  31 A pad can ask the peer pad for its supported GstCaps. It does this with
  32 the CAPS query. The list of supported caps can be used to choose an
  33 appropriate GstCaps for the data transfer.
  34
  35  (in) "filter", GST_TYPE_CAPS (default NULL)
  36        - a GstCaps to filter the results against
  37
  38  (out) "caps", GST_TYPE_CAPS (default NULL)
  39        - the result caps
  40
  41
  42 A pad can ask the peer pad whether it supports a given GstCaps. It does this
  43 with the ACCEPT_CAPS query.
  44
  45  (in) "caps", GST_TYPE_CAPS
  46        - a GstCaps to check
  47
  48  (out) "result", G_TYPE_BOOLEAN (default FALSE)
  49        - TRUE if the caps are accepted
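The two queries can be illustrated with a small toy model. This is a sketch
in plain Python, not GStreamer API: ToyPad, query_caps() and accept_caps() are
hypothetical names, and caps are reduced to an ordered list of format strings
with the most preferred format first. The sketch keeps the pad's own
preference order when filtering; real elements may order the result
differently.

```python
# Toy model only: ToyPad and its methods are hypothetical, not GStreamer API.
# Caps are modelled as an ordered list of format names, most preferred first.

class ToyPad:
    def __init__(self, supported):
        self.supported = list(supported)

    def query_caps(self, filter_caps=None):
        """CAPS query: supported formats, optionally restricted by a filter."""
        if filter_caps is None:
            return list(self.supported)
        return [fmt for fmt in self.supported if fmt in filter_caps]

    def accept_caps(self, fmt):
        """ACCEPT_CAPS query: True if the given (fixed) format is supported."""
        return fmt in self.supported


sink = ToyPad(["I420", "YUY2", "RGB"])
unfiltered = sink.query_caps()
filtered = sink.query_caps(filter_caps=["RGB", "I420"])
accepted = sink.accept_caps("NV12")
print(unfiltered, filtered, accepted)
```

The filter only narrows the answer; it never adds a format the pad did not
support in the first place.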
  50
  51
  52 Events
  53 ~~~~~~
  54
  55 When a media format is negotiated, peer elements are notified of the GstCaps
  56 with the CAPS event. The caps must be fixed.
  57
  58     "caps", GST_TYPE_CAPS
  59        - the negotiated GstCaps
  60
  61
  62 Operation
  63 ~~~~~~~~~
  64
  65 GStreamer's two scheduling modes, push mode and pull mode, lend
  66 themselves to different mechanisms for agreeing on a format. Because it
  67 is more common, push-mode negotiation is described first.
  68
  69
  70 Push-mode negotiation
  71 ~~~~~~~~~~~~~~~~~~~~~
  72
  73 Push-mode negotiation happens when elements want to push buffers and
  74 need to decide on the format. This is called downstream negotiation
  75 because the upstream element decides the format for the downstream
  76 element. This is the most common case.
  77
  78 Negotiation can also happen when a downstream element wants to receive
  79 another data format from an upstream element. This is called upstream
  80 negotiation.
  81
  82 The basics of negotiation are as follows:
  83
  84  - GstCaps (see part-caps.txt) are refcounted before they are pushed as
  85    an event to describe the contents of the following buffer.
  86
  87  - An element should reconfigure itself to the new format received as a CAPS
  88    event before processing the following buffers. If the data type in the
  89    caps event is not acceptable, the element should refuse the event. The
  90    element should also refuse the following buffers by returning
  91    GST_FLOW_NOT_NEGOTIATED from the chain function.
  92
  93  - Downstream elements can request a format change of the stream by sending a
  94    RECONFIGURE event upstream. Upstream elements will renegotiate a new format
  95    when they receive a RECONFIGURE event.
  96
  97 The general flow for a source pad starting the negotiation:
  98
  99              src              sink
 100               |                 |
 101               |  querycaps?     |
 102               |---------------->|
 103               |     caps        |
 104  select caps  |< - - - - - - - -|
 105  from the     |                 |
 106  candidates   |                 |
 107               |                 |-.
 108               |  accepts?       | |
 109   type A      |---------------->| | optional
 110               |      yes        | |
 111               |< - - - - - - - -| |
 112               |                 |-'
 113               |  send_event()   |
 114  send CAPS    |---------------->| Receive type A, reconfigure to
 115  event A      |                 | process type A.
 116               |                 |
 117               |  push           |
 118  push buffer  |---------------->| Process buffer of type A
 119               |                 |
 120
 121  One possible implementation in pseudo code:
 122
 123  [element wants to create a buffer]
 124  if not format
 125    # see what we can do
 126    ourcaps = gst_pad_query_caps (srcpad, NULL)
 127    # see what the peer can do filtered against our caps
 128    candidates = gst_pad_peer_query_caps (srcpad, ourcaps)
 129
 130    foreach candidate in candidates
 131      # make sure the caps is fixed
 132      fixedcaps = gst_caps_fixate (candidate)
 133
 134      # see if the peer accepts it
 135      if gst_pad_peer_query_accept_caps (srcpad, fixedcaps)
 136        # store the caps as the negotiated caps, this will
 137        # call the setcaps function on the pad
 138        gst_pad_push_event (srcpad, gst_event_new_caps (fixedcaps))
 139        break
 140      endif
 141    done
 142  endif
 143
 144  # negotiate allocator/bufferpool with the ALLOCATION query
 145
 146  buffer = gst_buffer_new_allocate (NULL, size, NULL)
 147  # fill buffer and push
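The control flow of the pseudo code above can also be tried out with a toy
model. This is a sketch under loud assumptions: negotiate(), peer_query_caps()
and peer_accept_caps() are hypothetical helpers in plain Python, not GStreamer
calls, and every format string is treated as already fixed, so the fixation
step is omitted.

```python
# Toy model only: all names here are hypothetical, not GStreamer API.

def negotiate(src_formats, peer_query_caps, peer_accept_caps):
    """Return the first candidate the peer accepts, or None."""
    # Ask the peer what it can do, filtered against our own formats.
    candidates = peer_query_caps(src_formats)
    for candidate in candidates:
        # Optional double check, mirroring the ACCEPT_CAPS step.
        if peer_accept_caps(candidate):
            # This is the point where the CAPS event would be pushed.
            return candidate
    return None


sink_formats = ["YUY2", "I420"]  # sink preference order


def peer_query_caps(filter_caps):
    return [fmt for fmt in sink_formats if fmt in filter_caps]


def peer_accept_caps(fmt):
    return fmt in sink_formats


chosen = negotiate(["I420", "RGB", "YUY2"], peer_query_caps, peer_accept_caps)
no_match = negotiate(["RGB"], peer_query_caps, peer_accept_caps)
print(chosen, no_match)
```

Because the candidates come back in the sink's order, the source ends up with
the sink's preferred format even though the source makes the final decision.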
 148
 149
 150 The general flow for a sink pad starting a renegotiation:
 151
 152              src              sink
 153               |                 |
 154               |  accepts?       |
 155               |<----------------| type B
 156               |      yes        |
 157               |- - - - - - - - >|-.
 158               |                 | | suggest B caps next
 159               |                 |<'
 160               |                 |
 161               |   push_event()  |
 162   mark      .-|<----------------| send RECONFIGURE event
 163  renegotiate| |                 |
 164             '>|                 |
 165               |  querycaps()    |
 166  renegotiate  |---------------->|
 167               |  suggest B      |
 168               |< - - - - - - - -|
 169               |                 |
 170               |  send_event()   |
 171  send CAPS    |---------------->| Receive type B, reconfigure to
 172  event B      |                 | process type B.
 173               |                 |
 174               |  push           |
 175  push buffer  |---------------->| Process buffer of type B
 176               |                 |
 177
 178
 179 Use cases:
 180
 181
 182 videotestsrc ! xvimagesink
 183
 184   1) Who decides what format to use?
 185    - src pad always decides, by convention. sinkpad can suggest a format
 186      by putting it high in the caps query result GstCaps.
 187    - since the src decides, it can always choose something that it can do,
 188      so this step can only fail if the sinkpad stated it could accept
 189      something while later on it couldn't.
 190
 191   2) When does negotiation happen?
 192    - before srcpad does a push, it figures out a type as stated in 1), then
 193      it pushes a caps event with the type. The sink checks the media type and
 194      configures itself for this type.
 195    - the source then usually does an ALLOCATION query to negotiate a bufferpool
 196      with the sink. It then allocates a buffer from the pool and pushes it to
 197      the sink. since the sink accepted the caps, it can create a pool for the
 198      format.
 199    - since the sink stated in 1) it could accept the type, it will be able to
 200      handle it.
 201
 202   3) How can sink request another format?
 203    - sink asks if new format is possible for the source.
 204    - sink pushes RECONFIGURE event upstream
 205    - src receives the RECONFIGURE event and marks renegotiation
 206    - On the next buffer push, the source renegotiates the caps and the
 207      bufferpool. The sink will put the new preferred format high in the list
 208      of caps it returns from its caps query.
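The renegotiation sequence in 3) can be sketched with a toy model. ToySink
and ToySource are hypothetical classes in plain Python, not GStreamer API; the
bufferpool negotiation is left out, and the need_reconfigure flag stands in
for the "mark renegotiate" step of the diagram.

```python
# Toy model only: ToySink and ToySource are hypothetical, not GStreamer API.

class ToySink:
    def __init__(self, formats):
        self.formats = list(formats)  # most preferred first
        self.configured = None

    def query_caps(self, filter_caps):
        return [fmt for fmt in self.formats if fmt in filter_caps]

    def prefer(self, fmt):
        # Put the newly preferred format first in the caps query result.
        self.formats.remove(fmt)
        self.formats.insert(0, fmt)

    def receive_caps(self, fmt):
        self.configured = fmt  # reconfigure for the new format


class ToySource:
    def __init__(self, formats, sink):
        self.formats = list(formats)
        self.sink = sink
        self.current = None
        self.need_reconfigure = True  # nothing negotiated yet

    def accept_caps(self, fmt):
        return fmt in self.formats

    def reconfigure(self):
        # RECONFIGURE event: only mark; the work happens on the next push.
        self.need_reconfigure = True

    def push(self):
        if self.need_reconfigure:
            candidates = self.sink.query_caps(self.formats)
            if not candidates:
                raise RuntimeError("not negotiated")
            self.current = candidates[0]
            self.sink.receive_caps(self.current)  # CAPS event
            self.need_reconfigure = False
        return self.current  # format of the buffer being pushed


sink = ToySink(["I420", "YUY2"])
src = ToySource(["YUY2", "I420"], sink)
first = src.push()

# The sink now wants YUY2: ask first, then suggest it and send RECONFIGURE.
if src.accept_caps("YUY2"):
    sink.prefer("YUY2")
    src.reconfigure()
second = src.push()
print(first, second)
```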
 209
 210 videotestsrc ! queue ! xvimagesink
 211
 212   - queue proxies all accept and caps queries to the other peer pad.
 213   - queue proxies the bufferpool
 214   - queue proxies the RECONFIGURE event
 215   - queue stores CAPS event in the queue. This means that the queue can contain
 216     buffers with different types.
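The last point can be pictured with a toy model in which CAPS events travel
in the same FIFO as the buffers. This is a sketch in plain Python; the tuple
layout and the drain() helper are assumptions made for illustration and do not
reflect how the queue element is implemented.

```python
# Toy model only: the item layout and drain() are hypothetical.
from collections import deque

queue = deque()
queue.append(("caps", "I420"))
queue.append(("buffer", b"frame-1"))
queue.append(("caps", "YUY2"))      # format changes while frame-1 is queued
queue.append(("buffer", b"frame-2"))


def drain(items):
    """Pop items in order, pairing each buffer with the caps in effect."""
    current_caps = None
    delivered = []
    while items:
        kind, payload = items.popleft()
        if kind == "caps":
            current_caps = payload
        else:
            delivered.append((current_caps, payload))
    return delivered


delivered = drain(queue)
print(delivered)
```

Since the CAPS event keeps its position relative to the buffers, the
downstream element sees each format change exactly before the first buffer
that uses it.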
 217
 218
 219 Pull-mode negotiation
 220 ~~~~~~~~~~~~~~~~~~~~~
 221
 222 Rationale
 223 ^^^^^^^^^
 224
 225 A pipeline in pull mode has different negotiation needs than one
 226 activated in push mode. Push mode is optimized for two use cases:
 227
 228  * Playback of media files, in which the demuxers and the decoders are
 229    the points from which format information should disseminate to the
 230    rest of the pipeline; and
 231
 232  * Recording from live sources, in which users are accustomed to putting
 233    a capsfilter directly after the source element; thus the caps
 234    information flow proceeds from the user, through the potential caps
 235    of the source, to the sinks of the pipeline.
 236
 237 In contrast, pull mode has other typical use cases:
 238
 239  * Playback from a lossy source, such as RTP, in which more knowledge
 240    about the latency of the pipeline can increase quality;
 241
 242  * Audio synthesis, in which audio APIs are tuned to produce only the
 243    necessary number of samples, typically driven by a hardware interrupt
 244    to fill a DMA buffer or a Jack[0] port buffer; or
 245
 246  * Low-latency effects processing, whereby filters should be applied as
 247    data is transferred from a ring buffer to a sink instead of
 248    beforehand. For example, instead of using the internal alsasink
 249    ringbuffer thread in push-mode wavsrc ! volume ! alsasink, placing
 250    the volume inside the sound card writer thread via wavsrc !
 251    audioringbuffer ! volume ! alsasink.
 252
 253 [0] http://jackit.sf.net
 254
 255 The problem with pull mode is that the sink has to know the format in
 256 order to know how many bytes to pull via gst_pad_pull_range(). This
 257 means that before pulling, the sink must initiate negotiation to decide
 258 on a format.
 259
 260 Recalling the principles of capsnego, whereby information must flow from
 261 those that have it to those that do not, we see that the named use
 262 cases have different negotiation requirements:
 263
 264  * RTP and low-latency playback are both like the normal playback case,
 265    in which information flows downstream.
 266
 267  * In audio synthesis, the part of the pipeline that has the most
 268    information is the sink, constrained by the capabilities of the graph
 269    that feeds it. However the caps are not completely specified; at some
 270    point the user has to intervene to choose the sample rate, at least.
 271    This can be done externally to GStreamer, as in the jack elements, or
 272    internally via a capsfilter, as is customary with live sources.
 273
 274 Given that sinks potentially need the input of sources, as in the RTP
 275 case and at least as a filter in the synthesis case, there must be a
 276 negotiation phase before the pull thread is activated. Also, given the
 277 low latency offered by pull mode, we want to avoid capsnego from within
 278 the pulling thread, in case it causes us to miss our scheduling
 279 deadlines.
 280
 281 The pull thread is usually started in the PAUSED->PLAYING state change. We must
 282 be able to complete the negotiation before this state change happens.
 283
 284 The time to do capsnego, then, is after the SCHEDULING query has succeeded,
 285 but before the sink has spawned the pulling thread.
 286
 287
 288 Mechanism
 289 ^^^^^^^^^
 290
 291 The sink determines that the upstream elements support pull-based scheduling by
 292 doing a SCHEDULING query.
 293
 294 The sink initiates the negotiation process by intersecting the results
 295 of gst_pad_query_caps() on its sink pad and its peer src pad. This is the
 296 operation performed by gst_pad_get_allowed_caps(). In the simple
 297 passthrough case, the peer pad's caps query should return the
 298 intersection of calling get_allowed_caps() on all of its sink pads. In
 299 this way the sink element knows the capabilities of the entire pipeline.
 300
 301 The sink element then fixates the resulting caps, if necessary,
 302 which yields the flow caps.  From now on, the caps query of the sinkpad
 303 will only return these fixed caps, meaning that upstream elements
 304 will only be able to produce this format.
 305
 306 If the sink element could not set caps on its sink pad, it should post
 307 an error message on the bus indicating that negotiation was not
 308 possible.
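The intersect-then-fixate steps described above can be sketched with a toy
model. allowed_caps() and fixate() are hypothetical plain-Python helpers, not
the GStreamer functions of similar name, and fixation is simplified to "take
the first candidate"; real fixation picks a single value for every unfixed
field.

```python
# Toy model only: allowed_caps() and fixate() are hypothetical helpers.

def allowed_caps(sink_formats, peer_formats):
    """Intersect the sink's formats with what the peer src pad reports."""
    return [fmt for fmt in sink_formats if fmt in peer_formats]


def fixate(candidates):
    """Reduce the candidates to one format, or fail the negotiation."""
    if not candidates:
        # Stands in for posting an error message on the bus.
        raise ValueError("negotiation was not possible")
    return candidates[0]


# What the peer src pad reports for the whole upstream part of the pipeline.
upstream_formats = ["S16LE@44100", "S16LE@48000", "F32LE@48000"]
sink_formats = ["F32LE@48000", "S16LE@48000"]

flow_caps = fixate(allowed_caps(sink_formats, upstream_formats))
print(flow_caps)

# From now on the sink pad only advertises the flow caps.
sink_formats = [flow_caps]
```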
 309
 310 When negotiation succeeds, the sinkpad and all upstream internally linked pads
 311 are activated in pull mode. Typically, this operation will trigger negotiation
 312 on the downstream elements, which will now be forced to negotiate to the
 313 final fixed desired caps of the sinkpad.
 314
 315 After these steps, the sink element returns ASYNC from the state change
 316 function. The state will commit to PAUSED when the first buffer is received in
 317 the sink. This is needed to provide a consistent API to the applications that
 318 expect ASYNC return values from sinks, but it also allows us to perform the
 319 remainder of the negotiation outside of the context of the pulling thread.
 320
 321
 322 Patterns
 323 ~~~~~~~~
 324
 325 We can identify 3 patterns in negotiation:
 326
 327  1) Fixed : Can't choose the output format
 328       - Caps encoded in the stream
 329       - A video/audio decoder
 330       - usually uses gst_pad_use_fixed_caps()
 331
 332  2) Passthrough
 333       - Caps not modified
 334       - can do caps transform based on element property
 335       - videobox
 336
 337  3) Dynamic : can choose output format
 338       - A converter element
 339       - depends on downstream caps, needs to do a CAPS query to find
 340         transform.
 341       - usually prefers to use the identity transform
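The dynamic pattern can be sketched as follows. dynamic_output() is a
hypothetical plain-Python helper, not GStreamer API; "downstream" stands for
the result of the CAPS query on the peer, and "producible" for the formats the
converter is able to output.

```python
# Toy model only: dynamic_output() is a hypothetical helper.

def dynamic_output(input_format, producible, downstream):
    """Pick an output format for a converter-like element."""
    # Prefer the identity transform when downstream can take the input as is.
    if input_format in downstream:
        return input_format
    # Otherwise follow downstream's preference among what we can produce.
    for fmt in downstream:
        if fmt in producible:
            return fmt
    return None  # no transform found: negotiation fails


identity = dynamic_output("I420", ["I420", "RGB"], ["RGB", "I420"])
converted = dynamic_output("I420", ["I420", "RGB"], ["RGB"])
impossible = dynamic_output("I420", ["I420", "RGB"], ["NV12"])
print(identity, converted, impossible)
```

In the first call the identity transform wins even though downstream lists RGB
first, which matches the stated preference for passing data through
unmodified.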