Capabilities negotiation is the process of deciding on an adequate
format for dataflow within a GStreamer pipeline. Ideally, negotiation
(also known as "capsnego") transfers information from those parts of the
pipeline that have information to those parts of the pipeline that are
flexible, constrained by those parts of the pipeline that are not
flexible.
Two simple rules must be followed:

 1) downstream suggests formats
 2) upstream decides on format

There are 4 queries/events used in caps negotiation:

 1) GST_QUERY_CAPS        : get possible formats
 2) GST_QUERY_ACCEPT_CAPS : check if format is possible
 3) GST_EVENT_CAPS        : configure format (downstream)
 4) GST_EVENT_RECONFIGURE : inform upstream of possibly new caps
A pad can ask the peer pad for its supported GstCaps. It does this with
the CAPS query. The list of supported caps can be used to choose an
appropriate GstCaps for the data transfer.

 (in) "filter", GST_TYPE_CAPS (default NULL)
       - a GstCaps to filter the results against

 (out) "caps", GST_TYPE_CAPS (default NULL)
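The effect of the "filter" argument can be illustrated with a toy model in
which caps are reduced to an ordered list of format names, most preferred
first. This is only a sketch and does not use the GStreamer API: query_caps()
and the format names are hypothetical, and keeping the answering pad's own
ordering in the result is an assumption of the model.

```python
def query_caps(supported, filter_caps=None):
    """Toy CAPS query: return the supported formats, optionally filtered.

    `supported` is ordered by the answering pad's preference. When a
    filter is given, only formats that also appear in it are returned.
    """
    if filter_caps is None:
        return list(supported)
    return [fmt for fmt in supported if fmt in filter_caps]


sink_formats = ["I420", "YUY2", "RGB"]
print(query_caps(sink_formats))                   # ['I420', 'YUY2', 'RGB']
print(query_caps(sink_formats, ["RGB", "I420"]))  # ['I420', 'RGB']
```

An empty result means the two sides have no format in common.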
A pad can ask the peer pad if it supports a given caps. It does this with
the ACCEPT_CAPS query.

 (in) "caps", GST_TYPE_CAPS

 (out) "result", G_TYPE_BOOLEAN (default FALSE)
       - TRUE if the caps are accepted
When a media format is negotiated, peer elements are notified of the GstCaps
with the CAPS event. The caps must be fixed.

       - the negotiated GstCaps
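What "fixed" means can be sketched with a toy representation of caps: a list
of structures, each a dict whose values are either a single value, a list of
alternatives, or a (min, max) tuple standing in for a range. The
representation and the is_fixed() helper are assumptions made for
illustration; they are not the GstCaps API.

```python
def is_fixed(caps):
    """Return True if the toy caps describe exactly one concrete format."""
    if len(caps) != 1:
        return False  # several structures: still a choice to make
    # Every field must hold a single value, not a list or a range.
    return all(not isinstance(v, (list, tuple)) for v in caps[0].values())


unfixed = [{"format": ["I420", "YUY2"], "width": (1, 4096), "height": 240}]
fixed = [{"format": "I420", "width": 320, "height": 240}]
print(is_fixed(unfixed))  # False
print(is_fixed(fixed))    # True
```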
GStreamer's two scheduling modes, push mode and pull mode, lend
themselves to different mechanisms to achieve this goal. As it is more
common, we describe push-mode negotiation first.
Push-mode negotiation happens when elements want to push buffers and
need to decide on the format. This is called downstream negotiation
because the upstream element decides the format for the downstream
element. This is the most common case.

Negotiation can also happen when a downstream element wants to receive
another data format from an upstream element. This is called upstream
negotiation.
The basics of negotiation are as follows:

 - GstCaps (see part-caps.txt) are refcounted before they are pushed as
   an event to describe the contents of the following buffer.

 - An element should reconfigure itself to the new format received as a CAPS
   event before processing the following buffers. If the data type in the
   caps event is not acceptable, the element should refuse the event. The
   element should also refuse the next buffers by returning an appropriate
   GST_FLOW_NOT_NEGOTIATED return value from the chain function.

 - Downstream elements can request a format change of the stream by sending a
   RECONFIGURE event upstream. Upstream elements will renegotiate a new format
   when they receive a RECONFIGURE event.
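The second rule above (refuse an unacceptable CAPS event, then refuse the
buffers that follow) can be sketched with a toy element. The class, its
method names and the FlowReturn enum are hypothetical stand-ins, not the
GStreamer API.

```python
from enum import Enum


class FlowReturn(Enum):
    OK = "ok"
    NOT_NEGOTIATED = "not-negotiated"


class ToyElement:
    """Toy downstream element that only handles the formats it supports."""

    def __init__(self, supported):
        self.supported = supported
        self.current = None  # format set by the last accepted CAPS event

    def handle_caps_event(self, fmt):
        """Reconfigure for `fmt`; return False to refuse the event."""
        if fmt not in self.supported:
            self.current = None
            return False
        self.current = fmt
        return True

    def chain(self, buffer):
        """Process one buffer, refusing it when no format is configured."""
        if self.current is None:
            return FlowReturn.NOT_NEGOTIATED
        return FlowReturn.OK


element = ToyElement(["I420", "YUY2"])
print(element.handle_caps_event("RGB"), element.chain(b"data").name)
print(element.handle_caps_event("I420"), element.chain(b"data").name)
```

The first line printed is "False NOT_NEGOTIATED", the second "True OK".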
The general flow for a source pad starting the negotiation:

 select caps  |< - - - - - - - -|
 type A       |---------------->| | optional
              |< - - - - - - - -| |
 send CAPS    |---------------->| Receive type A, reconfigure to
 event A      |                 | process type A.
 push buffer  |---------------->| Process buffer of type A
One possible implementation in pseudo code:

 [element wants to create a buffer]
 ourcaps = gst_pad_query_caps (srcpad, NULL)
 # see what the peer can do filtered against our caps
 candidates = gst_pad_peer_query_caps (srcpad, ourcaps)

 foreach candidate in candidates
   # make sure the caps is fixed
   fixedcaps = gst_caps_fixate (candidate)

   # see if the peer accepts it
   if gst_pad_peer_query_accept_caps (srcpad, fixedcaps)
     # store the caps as the negotiated caps, this will
     # call the setcaps function on the pad
     gst_pad_push_event (srcpad, gst_event_new_caps (fixedcaps))
     # stop at the first candidate the peer accepts
     break

 # negotiate allocator/bufferpool with the ALLOCATION query

 buffer = gst_buffer_new_allocate (NULL, size, NULL)
 # fill buffer and push
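The same loop can be run against a toy model in which caps are ordered lists
of format names. ToyPeer and negotiate() are hypothetical names used only for
this sketch; they mimic the query, accept and event steps but are not the
GStreamer API, and the fixation step disappears because every candidate in
this model is already a single format.

```python
class ToyPeer:
    """Toy downstream pad: formats are listed most preferred first."""

    def __init__(self, supported):
        self.supported = supported
        self.negotiated = None

    def query_caps(self, filter_caps):
        # Answer in the peer's own order, restricted to the filter.
        return [fmt for fmt in self.supported if fmt in filter_caps]

    def accept_caps(self, fmt):
        return fmt in self.supported

    def caps_event(self, fmt):
        self.negotiated = fmt


def negotiate(our_caps, peer):
    """Send the first candidate the peer accepts as a CAPS event."""
    for candidate in peer.query_caps(our_caps):
        if peer.accept_caps(candidate):
            peer.caps_event(candidate)
            return candidate
    return None  # nothing in common: negotiation failed


peer = ToyPeer(["I420", "YUY2"])
print(negotiate(["RGB", "YUY2", "I420"], peer))  # I420
```

Upstream makes the decision, but the order of the peer's answer steers it:
I420 wins here because the peer listed it first.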
The general flow for a sink pad starting a renegotiation:

              |<----------------| type B
              |- - - - - - - - >|-.
              |                 | | suggest B caps next
 mark       .-|<----------------| send RECONFIGURE event
 renegotiate  |---------------->|
 send CAPS    |---------------->| Receive type B, reconfigure to
 event B      |                 | process type B.
 push buffer  |---------------->| Process buffer of type B
 videotestsrc ! xvimagesink

  1) Who decides what format to use?
    - src pad always decides, by convention. sinkpad can suggest a format
      by putting it high in the caps query result GstCaps.
    - since the src decides, it can always choose something that it can do,
      so this step can only fail if the sinkpad stated it could accept
      something while later on it couldn't.

  2) When does negotiation happen?
    - before srcpad does a push, it figures out a type as stated in 1), then
      it pushes a caps event with the type. The sink checks the media type and
      configures itself for this type.
    - the source then usually does an ALLOCATION query to negotiate a bufferpool
      with the sink. It then allocates a buffer from the pool and pushes it to
      the sink. Since the sink accepted the caps, it can create a pool for the
      format.
    - since the sink stated in 1) it could accept the type, it will be able to
      handle it.

  3) How can sink request another format?
    - sink asks if new format is possible for the source.
    - sink pushes RECONFIGURE event upstream
    - src receives the RECONFIGURE event and marks renegotiation
    - On the next buffer push, the source renegotiates the caps and the
      bufferpool. The sink will put the new preferred format high in the list
      of caps it returns from its caps query.
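The renegotiation steps in 3) can be traced with a toy source and sink. All
class and method names here are hypothetical, caps are reduced to format
names, and the bufferpool renegotiation is left out; the point is only that
the RECONFIGURE event marks the source and the real work happens on the next
push.

```python
class ToySink:
    """Toy sink: formats are listed most preferred first."""

    def __init__(self, supported):
        self.supported = list(supported)
        self.current = None
        self.received = []

    def query_caps(self):
        return list(self.supported)

    def caps_event(self, fmt):
        self.current = fmt

    def chain(self, data):
        self.received.append((self.current, data))

    def request_format(self, fmt, source):
        """Ask upstream to switch to `fmt`; return False if impossible."""
        if fmt not in self.supported or not source.can_produce(fmt):
            return False
        # Suggest the new format by moving it to the top of the caps list.
        self.supported.remove(fmt)
        self.supported.insert(0, fmt)
        source.reconfigure_event()
        return True


class ToySource:
    """Toy source that renegotiates lazily, on the next push."""

    def __init__(self, supported):
        self.supported = list(supported)
        self.current = None
        self.need_reconfigure = True  # nothing negotiated yet

    def can_produce(self, fmt):
        return fmt in self.supported

    def reconfigure_event(self):
        # Only mark; the renegotiation itself happens in push().
        self.need_reconfigure = True

    def push(self, sink, data):
        if self.need_reconfigure:
            self.need_reconfigure = False
            # Take the sink's most preferred format that we can produce.
            fmt = next((f for f in sink.query_caps() if f in self.supported), None)
            if fmt is None:
                raise ValueError("not negotiated")
            self.current = fmt
            sink.caps_event(fmt)
        sink.chain(data)


source = ToySource(["A", "B"])
sink = ToySink(["A", "B"])
source.push(sink, "buf1")
sink.request_format("B", source)
source.push(sink, "buf2")
print(sink.received)  # [('A', 'buf1'), ('B', 'buf2')]
```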
 videotestsrc ! queue ! xvimagesink

  - queue proxies all accept and caps queries to the other peer pad.
  - queue proxies the bufferpool
  - queue proxies the RECONFIGURE event
  - queue stores CAPS event in the queue. This means that the queue can contain
    buffers with different types.
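The last point can be pictured by keeping CAPS events in the same FIFO as the
buffers, so each buffer is later paired with the format that was in effect
when it was queued. This is a toy sketch: the tuple layout and the drain()
helper are made up for illustration and do not reflect the queue element's
actual implementation.

```python
from collections import deque


def drain(queue):
    """Pop everything from the queue, pairing each buffer with its format."""
    current = None
    out = []
    while queue:
        kind, payload = queue.popleft()
        if kind == "caps":
            current = payload  # applies to all buffers that follow
        else:
            out.append((current, payload))
    return out


queue = deque()
queue.append(("caps", "A"))
queue.append(("buffer", "buf1"))
queue.append(("caps", "B"))      # renegotiated while buf1 is still queued
queue.append(("buffer", "buf2"))
print(drain(queue))  # [('A', 'buf1'), ('B', 'buf2')]
```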
Pull-mode negotiation
~~~~~~~~~~~~~~~~~~~~~

A pipeline in pull mode has different negotiation needs than one
activated in push mode. Push mode is optimized for two use cases:

 * Playback of media files, in which the demuxers and the decoders are
   the points from which format information should disseminate to the
   rest of the pipeline; and

 * Recording from live sources, in which users are accustomed to putting
   a capsfilter directly after the source element; thus the caps
   information flow proceeds from the user, through the potential caps
   of the source, to the sinks of the pipeline.
In contrast, pull mode has other typical use cases:

 * Playback from a lossy source, such as RTP, in which more knowledge
   about the latency of the pipeline can increase quality;

 * Audio synthesis, in which audio APIs are tuned to producing only the
   necessary number of samples, typically driven by a hardware interrupt
   to fill a DMA buffer or a Jack[0] port buffer; and

 * Low-latency effects processing, whereby filters should be applied as
   data is transferred from a ring buffer to a sink instead of
   beforehand. For example, instead of using the internal alsasink
   ringbuffer thread in push-mode wavsrc ! volume ! alsasink, the volume
   can be placed inside the sound card writer thread via wavsrc !
   audioringbuffer ! volume ! alsasink.

[0] http://jackit.sf.net

The problem with pull mode is that the sink has to know the format in
order to know how many bytes to pull via gst_pad_pull_range(). This
means that before pulling, the sink must initiate negotiation to decide
on a format.
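For raw audio, for instance, the byte count follows directly from the
negotiated format. A minimal sketch; the function name and the example format
(16-bit stereo) are illustrative assumptions, not something the sink API
prescribes:

```python
def bytes_to_pull(samples, channels, bytes_per_sample):
    """Size in bytes of `samples` frames of interleaved raw audio."""
    return samples * channels * bytes_per_sample


# 1024 frames of 16-bit (2-byte) stereo audio
print(bytes_to_pull(1024, 2, 2))  # 4096
```

Until the number of channels and the sample width are known, the sink cannot
turn "the next 1024 samples" into a byte range to request.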
Recalling the principles of capsnego, whereby information must flow from
those that have it to those that do not, we see that the three named use
cases have different negotiation requirements:

 * RTP and low-latency playback are both like the normal playback case,
   in which information flows downstream.

 * In audio synthesis, the part of the pipeline that has the most
   information is the sink, constrained by the capabilities of the graph
   that feeds it. However, the caps are not completely specified; at some
   point the user has to intervene to choose the sample rate, at least.
   This can be done externally to GStreamer, as in the jack elements, or
   internally via a capsfilter, as is customary with live sources.

Given that sinks potentially need the input of sources, as in the RTP
case and at least as a filter in the synthesis case, there must be a
negotiation phase before the pull thread is activated. Also, given the
low latency offered by pull mode, we want to avoid capsnego from within
the pulling thread, in case it causes us to miss our scheduling
deadlines.
The pull thread is usually started in the PAUSED->PLAYING state change. We must
be able to complete the negotiation before this state change happens.

The time to do capsnego, then, is after the SCHEDULING query has succeeded,
but before the sink has spawned the pulling thread.
The sink determines that the upstream elements support pull based scheduling by
doing a SCHEDULING query.

The sink initiates the negotiation process by intersecting the results
of gst_pad_query_caps() on its sink pad and its peer src pad. This is the
operation performed by gst_pad_get_allowed_caps(). In the simple
passthrough case, the peer pad's caps query should return the
intersection of calling get_allowed_caps() on all of its sink pads. In
this way the sink element knows the capabilities of the entire pipeline.

The sink element then fixates the resulting caps, if necessary,
resulting in the flow caps. From now on, the caps query of the sinkpad
will only return these fixed caps, meaning that upstream elements
will only be able to produce this format.

If the sink element could not set caps on its sink pad, it should post
an error message on the bus indicating that negotiation was not
possible.
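The intersect-then-fixate sequence described above can be sketched on a toy
model where caps are ordered lists of format names. The helper names are
hypothetical, "fixation" is reduced to picking the first candidate, and the
raised exception merely stands in for the error message on the bus.

```python
def allowed_caps(sink_caps, peer_caps):
    """Intersect the sink's own caps with what its peer reports."""
    return [fmt for fmt in sink_caps if fmt in peer_caps]


def negotiate_pull(sink_caps, peer_caps):
    """Return the fixed flow caps, or raise if nothing is in common."""
    candidates = allowed_caps(sink_caps, peer_caps)
    if not candidates:
        # Stands in for posting an error message on the bus.
        raise ValueError("negotiation not possible")
    # Toy fixation: keep only the most preferred candidate.
    return candidates[0]


flow_caps = negotiate_pull(["S16", "F32"], ["F32", "S16", "U8"])
print(flow_caps)  # S16
```

Once flow_caps is chosen, a caps query on the sink pad in this model would
answer with that single format only.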
When negotiation succeeds, the sinkpad and all upstream internally linked pads
are activated in pull mode. Typically, this operation will trigger negotiation
on the downstream elements, which will now be forced to negotiate to the
final fixed desired caps of the sinkpad.

After these steps, the sink element returns ASYNC from the state change
function. The state will commit to PAUSED when the first buffer is received in
the sink. This is needed to provide a consistent API to the applications that
expect ASYNC return values from sinks but it also allows us to perform the
remainder of the negotiation outside of the context of the pulling thread.
We can identify 3 patterns in negotiation:

 1) Fixed : Can't choose the output format
     - Caps encoded in the stream
     - A video/audio decoder
     - usually uses gst_pad_use_fixed_caps()

     - can do caps transform based on element property

 3) Dynamic : can choose output format
     - A converter element
     - depends on downstream caps, needs to do a CAPS query to find
     - usually prefers to use the identity transform