The following observations are made when considering the current (17/11/2004)

Currently the state of a bin is determined by the highest state of its
children. This is a particular problem for GstThread, because a thread
should start/stop spinning at any time depending on the state of a child.

  +-------------------------------------+
  | +--------+   +---------+   +------+ |
  | |  src   |   | decoder |   | sink | |
  | |       src-sink      src-sink    | |
  | +--------+   +---------+   +------+ |
  +-------------------------------------+

When performing the state change on the GstThread to PLAYING, one of the
children (at random) will go to PLAYING first. This triggers a method
in GstThread that starts spinning the thread, so some elements are not yet
in the PLAYING state when the scheduler starts iterating elements. This
is not a clean way to start the data passing.

State changes also trigger negotiation, and scheduling (in the other thread)
can trigger it too. This creates races in negotiation.

- ERROR and EOS conditions triggering a state change

  A typical problem is also that, since scheduling starts while the state
  change happens, elements can go to EOS or ERROR before the state change
  completes. Currently this makes the elements go to PAUSED again, creating
  races with the state change in progress. This also gives the core the
  impression that the state change failed.

- no locking whatsoever

  When an element does a state change, it is possible for another thread to
  perform a conflicting state change.

- negotiation is not designed to work over multithread boundaries

  Negotiation over a queue is not possible. There is no method or policy for
  discovering a media type and then committing it. It is also not possible to
  tie the negotiated media to the relevant buffer.

  It should be possible to queue the old and the new formats in a queue.
  The element connected to the sinkpad of the queue should be able to
  find out that the new format will be accepted by the element connected
  on the srcpad of the queue, even if that element is streaming the old
  format.

          +------------------------------+
          |   +++++++++++++++++++++++++  |
        -sink |B|B|B|B|B|B|A|A|A|A|A|A| src-
          |   +++++++++++++++++++++++++  |
          +------------------------------+
  +----------+                        +----------+

- element properties are not threadsafe

  When setting an element property while streaming, the element does no
  locking whatsoever to guarantee its internal consistency.

- no control over streaming

  When some GstThread is iterating and you want to reconnect a pad, there
  is no way to block the pad, perform the actions and then unblock it
  again. This leads to thread problems where a pad is negotiating at the
  same time that it is passing data.

  This is currently solved by PAUSING the pipeline or by performing the
  actions in the same thread context as the iterate loop.

- race conditions in synchronizing the clocks and spinning up the pipeline.
  Currently the clock is started as soon as the pipeline is set to PLAYING.
  Because some time elapses before the elements are negotiated, autoplugged
  and streaming, the first frame/sample almost always arrives late at the
  sinks. Hacks exist to adjust the element base time to compensate for the
  delay, but this clearly is not clean.

- race conditions when performing seeks in the pipeline. Since the elements
  have no control over the streaming threads, they cannot block them or
  resync them to the new seek position. It is also hard to synchronize them

- race conditions when sending tags and error messages up the pipeline
  hierarchy. These races are caused either by glib refcounting problems or
  by improper locking.

- more as changes are implemented and testcases are written

2) possible solutions

- not allowing threading at all

  Run the complete pipeline in a single thread. Complications to solve include
  handling of blocking operations, like source elements blocking in kernel
  space, sink elements blocking on the clock or in kernel space, etc. In
  practice, all operations should be made non-blocking because a blocking
  element can cause the rest of the pipeline to block as well and cause it to
  miss a deadline. A non-blocking model needs cooperation from the kernel
  (with callbacks) or requires the use of a polling mechanism, both of which
  are either impractical or too CPU intensive and in general not achievable
  for a general purpose multimedia framework. For this reason we will not go
  further with this solution.

To make this work, we propose the following changes:

- Remove GstThread, it does not add anything useful in the sense that you
  cannot arbitrarily place the thread element, it needs decoupled elements
  around the

- Simplify the state changes of bins and elements. A bin or element never
  changes state automatically on EOS and ERROR.

- Introduce the concept of the application thread and the streaming thread.
  All data passing is done in the streaming thread. This also means that every
  operation is performed either in the application thread or in the streaming
  thread and that it should be protected against competing operations in
  other threads. This would define a policy for adding appropriate locking.

- Move the creation of threads into source and loop-based elements. This will
  make it possible for the elements in control of the threads to perform the
  locking when needed. One particular instance is the state change: by
  creating the threads in the element, it is possible to sync the streaming
  thread and the application thread (which does the state change).

- Remove negotiation from state changes. This will remove the conflict between
  streaming and negotiating elements.

- Add locks around pad operations like negotiation, streaming, linking, etc.
  This will remove races between these conflicting operations. This will also
  make it possible to un/block dataflow.

- Add locks around bin operations like adding/removing elements.

- Add locks around element operations like state changes and property changes.

- Add a 2-phase directed negotiation process. The source pad queries and
  figures out what the sinkpad can take in the first phase. In the second
  phase it sends the new format change as an event to the peer element. This
  event can be interleaved with the buffers and can travel over queues in
  between the buffers.
  Need to rethink this wrt bufferpools (see DShow and old bufferpool
  implementation)

- Add a preroll phase that will be used to spin up the pipeline and align
  frames/samples in the sinks. This phase will happen in the PAUSED state.
  This also means that dataflow will happen in the PAUSED state. Sinks will
  not sink samples in the PAUSED state but will complete their state change
  asynchronously. This will allow us to have perfect synchronisation with the
  clock.

- A two-phase seek policy. First the event travels upstream, putting all
  elements in the seeking phase and making them synchronize to the new
  position. In the second phase the DISCONT event signals the end of the seek
  and all filters can continue with the new position.

- Error messages, EOS, tags and other events in the pipeline should be sent to
  a mainloop. The app then has an in-thread mechanism for getting information
  about the pipeline. It should also be possible to get the messages directly
  from the elements themselves, like signals. The application programmer has
  to know that these events come from another thread and should handle them
  accordingly.

- Add return values to push/pull so that errors upstream or downstream can be
  noted by other elements so that they can disable themselves or propagate the
  error.
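
The 2-phase negotiation proposed in the list above can be sketched in a few
lines of C. This is only an illustration under stated assumptions: formats
are reduced to single letters, and every name in it (Queue, Sink,
src_set_format, sink_pull_buffer, ...) is hypothetical and not part of any
existing API. The point it shows is that the format change travels through
the queue as an in-band event, so buffers of the old format that are already
queued stay tied to the old format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_CAP 16

/* A queue item is either a data buffer or an in-band format-change event. */
typedef enum { ITEM_BUFFER, ITEM_FORMAT } ItemType;

typedef struct {
  ItemType type;
  char format;            /* only meaningful for ITEM_FORMAT */
} Item;

typedef struct {
  Item items[QUEUE_CAP];
  size_t head;            /* index of the oldest item */
  size_t count;           /* number of queued items */
} Queue;

/* Hypothetical sink: 'accepted' lists the one-letter formats it can take. */
typedef struct {
  const char *accepted;
  char current;           /* format of the data currently being consumed */
} Sink;

/* Append an item; returns 0 when the queue is full. */
int queue_push (Queue *q, Item item)
{
  if (q->count == QUEUE_CAP)
    return 0;
  q->items[(q->head + q->count) % QUEUE_CAP] = item;
  q->count++;
  return 1;
}

/* Phase 1: ask the sink whether it would accept the format. */
int sink_accepts (const Sink *sink, char format)
{
  return format != '\0' && strchr (sink->accepted, format) != NULL;
}

/* Phase 2: only after a positive answer, send the format as an event that
 * travels through the queue behind the buffers already in it. */
int src_set_format (Queue *q, const Sink *sink, char format)
{
  Item ev = { ITEM_FORMAT, format };

  assert (q != NULL && sink != NULL);
  if (!sink_accepts (sink, format))
    return 0;
  return queue_push (q, ev);
}

/* Queue one data buffer in whatever format was announced last. */
int src_push_buffer (Queue *q)
{
  Item buf = { ITEM_BUFFER, '\0' };
  return queue_push (q, buf);
}

/* Consume items until a buffer is found; format events seen on the way
 * update the sink. Returns the format of that buffer, or '\0' if the
 * queue ran empty first. */
char sink_pull_buffer (Sink *sink, Queue *q)
{
  while (q->count > 0) {
    Item item = q->items[q->head];

    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    if (item.type == ITEM_FORMAT)
      sink->current = item.format;
    else
      return sink->current;
  }
  return '\0';
}
```

A source would call src_set_format() before the first buffer and again on
every format change. No locking is shown; in a threaded pipeline the queue
would need the pad locks discussed above.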

3) detailed explanation

a) Pipeline construction

Pipeline construction includes:

- adding/removing elements to/from the bin
- finding elements in a bin by name and interface
- setting the clock on a pipeline
- setting properties on objects
- linking/unlinking pads

These operations should take the object lock to make sure they can be
executed from different threads.

When connecting pads to other pads from elements inside another bin,
we require that the bin has a ghostpad for the pad. This is needed so
that the bin looks like a self-contained element.

                    +---------------------+
   +---------+      | +--------+          |
   | element |      | |  src   |          |
  sink      src------sink     src- ...    |
   +---------+      | +--------+          |
                    +---------------------+

                   +-----------------------+
                   |    +--------+         |
   +---------+     |    |  src   |         |
   | element |     |   sink     src- ...   |
  sink      src---sink/ +--------+         |
   +---------+     +-----------------------+

This requirement is important when we need to sort the elements in the
bin to perform the state change.

- create a bin, add/remove elements from it
- add/remove from different threads and check the bin integrity.

An element can be in one of the four states: NULL, READY, PAUSED, PLAYING.

  NULL:    starting state of the element
  READY:   element is ready to start running
  PAUSED:  element is streaming data, has opened devices etc.
  PLAYING: element is streaming data and clock is running

Note that data starts streaming even in the PAUSED state. The only difference
between the PAUSED and PLAYING state is that the clock is running in the
PLAYING state. This mostly has an effect on the renderers, which will block on
the first sample they receive when in PAUSED mode. The transition from
READY->PAUSED is called the preroll state. During that transition, media is
queued in the pipeline and autoplugging is done.

Elements are put in a new state using the _set_state function. This function
can return the following return values:

  typedef enum {
    GST_STATE_FAILURE = 0,
    GST_STATE_PARTIAL = 1,
    GST_STATE_SUCCESS = 2,
    GST_STATE_ASYNC   = 3,
  } GstElementStateReturn;

GST_STATE_FAILURE is returned when the element failed to go to the
required state. When dealing with a bin, this is returned when one
of the elements failed to go to the required state. The other elements
in the bin might have changed their states successfully. This return
value means that the element did _not_ change state; for bins this
means that not all children have changed their state.

GST_STATE_PARTIAL is returned when some elements in a bin were in the
locked state and therefore did not change their state. Note that the
state of the bin will be changed regardless of this PARTIAL return value.

GST_STATE_SUCCESS is returned when all the elements successfully changed
their state.

GST_STATE_ASYNC is returned when an element is going to report the success
or failure of a state change later.
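
As an illustration of how a bin might fold the results of its children into
one of these return values, here is a minimal sketch. It rests on several
assumptions that the text does not settle: the numeric value of
GST_STATE_ASYNC, the ChildResult record, the bin_combine_results() helper
and, above all, the precedence FAILURE > ASYNC > PARTIAL > SUCCESS are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the return values in the text; the value 3 is an assumption. */
typedef enum {
  GST_STATE_FAILURE = 0,
  GST_STATE_PARTIAL = 1,
  GST_STATE_SUCCESS = 2,
  GST_STATE_ASYNC   = 3
} GstElementStateReturn;

/* Hypothetical record of what happened to one child of the bin. */
typedef struct {
  int locked;                    /* non-zero: child is locked and was skipped */
  GstElementStateReturn result;  /* what the child's state change returned */
} ChildResult;

/* Hypothetical helper: combine the children's results into the bin's result.
 * Assumed precedence: FAILURE > ASYNC > PARTIAL > SUCCESS. */
GstElementStateReturn
bin_combine_results (const ChildResult *children, size_t n)
{
  int have_async = 0, have_locked = 0;
  size_t i;

  assert (children != NULL || n == 0);

  for (i = 0; i < n; i++) {
    if (children[i].locked) {
      /* locked children keep their state, which makes the change PARTIAL */
      have_locked = 1;
      continue;
    }
    if (children[i].result == GST_STATE_FAILURE)
      return GST_STATE_FAILURE;
    if (children[i].result == GST_STATE_ASYNC)
      have_async = 1;
  }
  if (have_async)
    return GST_STATE_ASYNC;
  if (have_locked)
    return GST_STATE_PARTIAL;
  return GST_STATE_SUCCESS;
}
```

A pipeline would not hand an ASYNC result back to the application but would
wait for the pending children instead.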

The state of a bin is not related to the state of its children but only to
the last state change directly performed on the bin or on a parent bin. This
means that changing the state of an element inside the bin does not affect
the state of the bin.

Setting the state on a bin that is already in the correct state will
perform the requested state change on the children.

Elements are not allowed to change their own state. Bins are allowed
to change the state of their children. This means that the application
can only know about the states of the elements it has explicitly set.

There is a difference in the way a pipeline and a bin handle the state
change of their children:

- a bin returns GST_STATE_ASYNC when one of its children returns ASYNC.

- a pipeline never returns GST_STATE_ASYNC but returns from the state
  change function after all ASYNC elements completed the state change.
  This is done by polling the ASYNC elements until they return their
  final state.

The state change function must be fast and cannot block. If blocking behaviour
is unavoidable, the state change function must perform an async state change.
Sink elements therefore always use async state changes since they need to
wait until the first buffer arrives at the sink.

A bin has to change the state of its children from the sink elements to the
source elements. This makes sure that a sink element is always ready to
receive data from a source element in the case of a READY->PAUSED state change.
In the case of a PAUSED->READY state change, the sink element will be set to
READY first, so that the source element receives an error when it tries to
push data to this element and shuts down as well.

For loop based elements we have to be careful, since they can pull a buffer
from the peer element before it has been put in the right state.
The state of a loop based element is therefore only changed after the source
element has been put in the new state.

c) Element state change functions

The core will call the change_state function of an element with the element
lock held. The element is responsible for starting any streaming tasks/threads
and making sure that it synchronizes them to the state change function if
needed.

This means that no other thread is allowed to change the state of the element
at that time and, for bins, that it is not possible to add/remove elements.

When an element is busy doing the ASYNC state change, it is possible that
another state change happens. The elements should be prepared for this.

An element can receive a state change for the same state it is in. This
is not a problem; some elements (like bins) use this to resynchronize their
children. Other elements should ignore this state change and return SUCCESS.

When performing a state change on an element that returns ASYNC on one of
the state changes, ASYNC is returned and you can only proceed to the next
state change when this ASYNC state change has completed. Use the
gst_element_get_state function to know when the state change has completed.
An example of this behaviour is setting a videosink to PLAYING: it will
return ASYNC in the state change from READY->PAUSED. You can only set
it to PLAYING when this state change completes.

Bins will perform the state change code listed in d).

For performing the state change, two variables are used: the current state
of the element and the pending state. When the element is not performing a
state change, the pending state == None. The state change variables are
protected by the element lock. The pending state != None as long as the
state change is being performed or an ASYNC state change is running.
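
The bookkeeping of the current and pending state can be modelled as below.
This is a minimal single-threaded sketch: the element lock that protects both
variables is deliberately left out, and the type and function names
(ElementState, element_begin_change, element_commit) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
  STATE_NONE = 0,   /* "no pending state", called None in the text */
  STATE_NULL,
  STATE_READY,
  STATE_PAUSED,
  STATE_PLAYING
} State;

typedef struct {
  State current;    /* state the element is in */
  State pending;    /* state it is changing to, or STATE_NONE */
} ElementState;

void element_state_init (ElementState *e)
{
  e->current = STATE_NULL;
  e->pending = STATE_NONE;
}

/* Make the pending state the current one. Returns 1 if a state change was
 * committed, 0 if nothing was pending. */
int element_commit (ElementState *e)
{
  if (e->pending == STATE_NONE)
    return 0;
  e->current = e->pending;
  e->pending = STATE_NONE;
  return 1;
}

/* Start a state change. A synchronous change commits immediately; an ASYNC
 * one leaves 'pending' set until element_commit() is called later, for
 * example from the streaming thread. Returns 1 when the change is complete. */
int element_begin_change (ElementState *e, State target, int async)
{
  assert (target != STATE_NONE);
  e->pending = target;
  if (async)
    return 0;
  return element_commit (e);
}
```

While an ASYNC change is outstanding, current still holds the old state and
pending the requested one, which is exactly what a caller of the get_state
function sees on a timeout.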

The core provides the following function for applications and bins to
get the current state of an element:

  bool gst_element_get_state(elem, &state, &pending, timeout);

This function will block while the state change function is running inside
the element because it grabs the element lock.
When the element did not perform an async state change, this function returns
TRUE immediately with the state updated to reflect the current state of the
element and pending set to None.
When the element performed an async state change, this function will block
for the value of timeout and will return TRUE if the element completed the
async state change within that timeout; otherwise it returns FALSE, with
the current and pending state filled in.

The algorithm is like this:

  bool gst_element_get_state(elem, &state, &pending, timeout)
  {
    bool ret = FALSE;

    g_mutex_lock (ELEMENT_LOCK);
    if (elem->pending != none) {
      if (!g_cond_timed_wait (STATE, ELEMENT_LOCK, timeout)) {
        /* timeout triggered */
        *state = elem->state;
        *pending = elem->pending;
      }
    }
    if (elem->pending == none) {
      /* no state change in progress (anymore) */
      *state = elem->state;
      *pending = none;
      ret = TRUE;
    }
    g_mutex_unlock (ELEMENT_LOCK);
    return ret;
  }

For plugins the following function is provided to commit the pending state.
The ELEMENT_LOCK should be held when calling this function:

  gst_element_commit_state(element) {
    if (pending != none) {
      state = pending; pending = none;  /* pending state becomes current */
      g_cond_broadcast (STATE);
    }
  }

For bins the gst_element_get_state() works slightly differently. It will run
the function on all of its children. As soon as one of the children returns
FALSE, the method returns FALSE with the state set to the current bin state
and the pending set to the pending state.

For bins with elements that did an ASYNC state change, the _commit_state()
is only executed when actively calling _get_state(). The reason for this is
that when a child of the bin commits its state, this is not automatically
reported to the bin. This is not a problem since the _get_state() function
is the only way to get the current and pending state of the bin and is always

d) bin state change algorithm

In order to perform the sink to source state change a bin must be able to sort
the elements. To make this easier we require that elements are connected to
bins using ghostpads on the bin.

The algorithm goes like this:

  d = [ ]                           # list of delayed elements
  p = [ ]                           # list of pending async elements
  q = [ elements without srcpads ]  # sinks
  while q
    e = dequeue q
    s = [ all elements connected to e on the sinkpads ]
    q = q + s
    if e is an entry point: d = d + e, continue
    if state change of e returns ASYNC: p = p + e
  # afterwards the delayed elements in d change state in the same way
  # a bin would return ASYNC here if p is not empty
  # this last part is only performed by a pipeline: poll p until it is empty

The algorithm first tries to find the sink elements, i.e. the ones without
srcpads. Then it changes the state of each sink element and queues
the elements connected to the sinkpads.

The entry points (loopbased and getbased elements) are delayed, as we
first need to change the state of the other elements before we can activate
the entry points in the pipeline.

The pipeline will poll the async children before returning.
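
The ordering produced by this algorithm can be checked with a small C model.
It is a sketch under assumptions, not the core implementation: elements are
plain records in an array, the actual state change is replaced by recording
the order in which elements would be visited, and the Elem type, the
bin_plan_state_change() function and the 'seen' flags (added so that an
element reachable over two paths is only queued once) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ELEMS 16
#define MAX_PEERS 4

/* Hypothetical description of one element in the bin. */
typedef struct {
  const char *name;
  int n_srcpads;            /* 0 means the element is a sink */
  int is_entry_point;       /* loopbased or getbased element */
  int is_async;             /* its state change would return ASYNC */
  int upstream[MAX_PEERS];  /* indices of the peers on its sinkpads */
  int n_upstream;
} Elem;

/* Fills order[] with element indices in the order their state would be
 * changed and stores the number of ASYNC elements in *n_pending.
 * Returns the number of entries written to order[]. */
size_t
bin_plan_state_change (const Elem *elems, size_t n, int *order,
    size_t *n_pending)
{
  int queue[MAX_ELEMS], delayed[MAX_ELEMS];
  int seen[MAX_ELEMS] = { 0 };
  size_t head = 0, tail = 0, n_delayed = 0, n_order = 0, i;

  assert (n <= MAX_ELEMS);
  *n_pending = 0;

  /* q = [ elements without srcpads ] */
  for (i = 0; i < n; i++) {
    if (elems[i].n_srcpads == 0) {
      queue[tail++] = (int) i;
      seen[i] = 1;
    }
  }

  while (head < tail) {
    int e = queue[head++];
    int k;

    /* queue everything connected to the sinkpads of e */
    for (k = 0; k < elems[e].n_upstream; k++) {
      int peer = elems[e].upstream[k];

      if (!seen[peer]) {
        seen[peer] = 1;
        queue[tail++] = peer;
      }
    }
    /* entry points are activated last */
    if (elems[e].is_entry_point) {
      delayed[n_delayed++] = e;
      continue;
    }
    order[n_order++] = e;
    if (elems[e].is_async)
      (*n_pending)++;
  }

  /* now the delayed entry points */
  for (i = 0; i < n_delayed; i++) {
    order[n_order++] = delayed[i];
    if (elems[delayed[i]].is_async)
      (*n_pending)++;
  }
  return n_order;
}
```

For the src/decoder/sink pipeline used throughout this document, with the
source as the entry point and an async sink, the order comes out as sink,
decoder, src with one pending element. A bin would return ASYNC at that
point; a pipeline would poll the pending element first.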

e) The GstTask infrastructure

A new component, GstTask, is added to the core. A task is created by
an instance of the abstract GstScheduler class.

Each schedulable element (when added to a pipeline) is handed a
reference to a GstScheduler. It can use this object to create
a GstTask, which is basically a managed wrapper around a threading
library like GThread. It should be possible to write a GstScheduler
instance that uses other means of scheduling, like one that does not
use threads but implements task switching based on mutex locking.

When source and loopbased elements want to create the streaming thread
they create an instance of a GstTask, to which they pass a pointer to
a loop-function. This function will be called as soon as the element
performs GstTask.start(). The element can stop the GstTask, or use mutexes
to pause it, from, for example, the state change function or the

The GstTasks implement the streaming threads.

Elements start the streaming threads in the READY->PAUSED state change. Since
the elements that start the threads are put in the PAUSED state last,
after their connected elements, they will be able to deliver data to
their peers without problems.

Sink elements like audio and videosinks will return an async state change
reply and will only commit the state change after receiving the first
buffer. This will implement the preroll phase.

The following pseudo code shows an algorithm for committing the state
change in the streaming method.

  GST_OBJECT_LOCK (element);
  /* if we are going to PAUSED, we can commit the state change */
  if (GST_STATE_TRANSITION (element) == GST_STATE_READY_TO_PAUSED) {
    gst_element_commit_state (element);
  }
  /* if we are paused we need to wait for playing to continue */
  if (GST_STATE (element) == GST_STATE_PAUSED) {
    /* here we wait for the next state change */
    do {
      g_cond_wait (element->state_cond, GST_OBJECT_GET_LOCK (element));
    } while (GST_STATE (element) == GST_STATE_PAUSED);

    /* check if we got playing */
    if (GST_STATE (element) != GST_STATE_PLAYING) {
      /* not playing, we can't accept the buffer */
      GST_OBJECT_UNLOCK (element);
      gst_buffer_unref (buf);
      return GST_FLOW_WRONG_STATE;
    }
  }
  GST_OBJECT_UNLOCK (element);

g) return values for push/pull

To recover from pipeline errors in a more elegant manner than just
shutting down the pipeline, we need more fine-grained error messages
in the data transport. The plugins should be able to know what goes
wrong when interacting with their outside environment. This means
that gst_pad_push/gst_pad_pull and gst_event_send should return a

Possible return values include:

  Data transport was successful.

  An error occurred during transport, such as a fatal decoding error;
  the pad should not be used again.

  The pad was not connected.

  The peer does not know what datatype is going over the pipeline.

  The peer pad is not in the correct state.

  The peer pad did not expect the data because it was flushing or

  The operation is not supported.
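
To make the list above concrete, the sketch below maps a few pad conditions
onto such return values. Only the idea of a flow return comes from this
document. The enumerator names (apart from the WRONG_STATE name used in the
pseudo code earlier), the Pad record and the order in which pad_push() checks
the conditions are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the return values described in the list above. */
typedef enum {
  FLOW_OK,              /* data transport was successful */
  FLOW_ERROR,           /* fatal error, do not use the pad again */
  FLOW_NOT_CONNECTED,   /* the pad was not connected */
  FLOW_NOT_NEGOTIATED,  /* the peer does not know the datatype */
  FLOW_WRONG_STATE,     /* the peer pad is not in the correct state */
  FLOW_UNEXPECTED,      /* the peer did not expect data, e.g. flushing */
  FLOW_NOT_SUPPORTED    /* the operation is not supported */
} FlowReturn;

/* Hypothetical pad record with just enough state for the checks below. */
typedef struct Pad {
  struct Pad *peer;   /* NULL when the pad is not connected */
  int active;         /* peer can take data in its current state */
  int flushing;       /* peer is flushing and drops data */
  int negotiated;     /* a media type has been agreed on */
} Pad;

/* Decide what a push on 'pad' would return, without moving any data.
 * The order of the checks is an assumption. */
FlowReturn pad_push (const Pad *pad)
{
  assert (pad != NULL);

  if (pad->peer == NULL)
    return FLOW_NOT_CONNECTED;
  if (!pad->peer->active)
    return FLOW_WRONG_STATE;
  if (pad->peer->flushing)
    return FLOW_UNEXPECTED;
  if (!pad->peer->negotiated)
    return FLOW_NOT_NEGOTIATED;
  return FLOW_OK;
}
```

An element that receives anything other than FLOW_OK can stop its task or
hand the value to its own caller, so the error propagates without a state
change.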

The signatures of the functions will become:

  GstFlowReturn gst_pad_push (GstPad *pad, GstBuffer *buffer);
  GstFlowReturn gst_pad_pull (GstPad *pad, GstBuffer **buffer);

  GstResult gst_pad_push_event (GstPad *pad, GstEvent *event);

  - push_event will send the event to the connected pad.

For sending events from the application:

  GstResult gst_pad_send_event (GstPad *pad, GstEvent *event);

Implement a simple two phase negotiation. First the source queries the
sink if it accepts a certain format, then it sends the new format
as an event. Sink pads can also trigger a state change by requesting

i) Mainloop integration/GstBus

All error, warning and EOS messages from the plugins are sent to an event
queue. The pipeline reads the messages from the queue and will either
handle them or forward them to the main event queue that is read by the
application.

Specific pipelines can be written that deal with negotiation messages and
errors in the pipeline intelligently. The basic pipeline will stop the
pipeline when an error occurs.

Whenever an element posts a message on the event queue, a signal is also
fired that can be caught by the application. When dealing with those
signals the application has to be aware that they come from the streaming
threads and needs to make sure it uses proper locking to protect its

The messages will be implemented using a GstBus object that allows
plugins to post messages and allows the application to read messages either
synchronously or asynchronously. It is also possible to integrate the bus in
a mainloop.

The messages will derive from GstData to make them lightweight refcounted
objects. Need to figure out how we can extend this method to encapsulate
generic signals in messages too.

This decouples the streaming thread from the application thread and should
avoid race conditions and pipeline stalling due to application interaction.
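
The decoupling can be pictured as a plain FIFO that the streaming side posts
to and the application side pops from. The sketch below shows only that data
structure. It is not the GstBus design: the Bus, Message, bus_post and
bus_pop names are hypothetical, the capacity is fixed, and the mutex that a
real bus needs around post and pop is omitted to keep the example
single-threaded.

```c
#include <assert.h>
#include <stddef.h>

#define BUS_CAPACITY 8

typedef enum { MSG_ERROR, MSG_WARNING, MSG_EOS, MSG_TAG } MessageType;

typedef struct {
  MessageType type;
  const char *source;   /* name of the element that posted the message */
} Message;

/* Fixed-size ring buffer holding the messages that were not read yet. */
typedef struct {
  Message slots[BUS_CAPACITY];
  size_t head;          /* index of the oldest message */
  size_t count;         /* number of queued messages */
} Bus;

void bus_init (Bus *bus)
{
  bus->head = 0;
  bus->count = 0;
}

/* Called from the streaming side. Never blocks: returns 0 when the bus is
 * full so that the streaming thread cannot stall on the application. */
int bus_post (Bus *bus, MessageType type, const char *source)
{
  size_t tail;

  assert (bus != NULL);
  if (bus->count == BUS_CAPACITY)
    return 0;
  tail = (bus->head + bus->count) % BUS_CAPACITY;
  bus->slots[tail].type = type;
  bus->slots[tail].source = source;
  bus->count++;
  return 1;
}

/* Called from the application side, for example from a mainloop iteration.
 * Returns 0 when there is nothing to read. */
int bus_pop (Bus *bus, Message *out)
{
  if (bus->count == 0)
    return 0;
  *out = bus->slots[bus->head];
  bus->head = (bus->head + 1) % BUS_CAPACITY;
  bus->count--;
  return 1;
}
```

Messages are delivered in the order they were posted, and the application
reads them in its own thread context.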

It is still possible to receive the messages in the streaming thread context
if an application wants to. When doing this, special care has to be taken
when performing state changes.

When an element goes to EOS, it sends the EOS event to the peer plugin
and stops sending data on that pad. The peer element that received an EOS
event on a pad can refuse any buffers on that pad.

All elements without source pads must post the EOS message on the message
queue. When the pipeline receives an EOS event from all sinks, it will
post the EOS message on the application message queue so that the application
knows the pipeline is in EOS. Elements without any connected sourcepads
should also post the EOS message. This makes sure that all "dead-ends"

No state change happens when elements go to EOS, but the elements with the
GstTask will stop their tasks and so stop producing data.
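
The rule that the pipeline only reports EOS to the application once every
sink has posted it can be sketched with a bitmask. The EosTracker type and
its functions are hypothetical, and resetting the bookkeeping when the tasks
are restarted is an assumption of this example.

```c
#include <assert.h>

#define MAX_SINKS 32

typedef struct {
  unsigned n_sinks;        /* number of sinks in the pipeline */
  unsigned long eos_mask;  /* bit i is set once sink i has posted EOS */
  int app_notified;        /* EOS was already forwarded to the application */
} EosTracker;

void eos_tracker_init (EosTracker *t, unsigned n_sinks)
{
  assert (n_sinks > 0 && n_sinks <= MAX_SINKS);
  t->n_sinks = n_sinks;
  t->eos_mask = 0;
  t->app_notified = 0;
}

/* Record that 'sink' posted EOS. Returns 1 exactly once, when the last
 * outstanding sink reports; that is the moment the pipeline would post EOS
 * on the application message queue. Duplicate reports are ignored. */
int eos_tracker_sink_eos (EosTracker *t, unsigned sink)
{
  unsigned long all;

  assert (sink < t->n_sinks);
  t->eos_mask |= 1UL << sink;

  /* avoid shifting by the full width when unsigned long has 32 bits */
  all = (t->n_sinks == MAX_SINKS) ? 0xFFFFFFFFUL : (1UL << t->n_sinks) - 1UL;
  if (t->eos_mask == all && !t->app_notified) {
    t->app_notified = 1;
    return 1;
  }
  return 0;
}

/* Assumed behaviour: when the tasks are started again, for example by a
 * seek, the EOS bookkeeping starts over. */
void eos_tracker_reset (EosTracker *t)
{
  t->eos_mask = 0;
  t->app_notified = 0;
}
```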

An application can issue a seek operation, which makes all tasks run
again so that they can start streaming from the new location.

A) threads and lowlatency

People often think it is a sin to use threads in low latency applications.
This is true when the data has to cross thread boundaries but false when it
doesn't. Since source and loop based elements create a thread, it is possible
to construct a pipeline where data passing has to cross thread boundaries.
Consider this case:

  +-----------------------------------+
  |      +--------+   +--------+      |
  |      |element1|   |element2|      |
  | .. -sink     src-sink     src- .. |
  |      +--------+   +--------+      |
  +-----------------------------------+

The two elements are loop based and thus each create a thread to drive the
pipeline. At the border between the two elements there is a mutex to pass the
data between the two threads. When using these kinds of elements in a
pipeline, low latency will not be possible. For low-latency apps, don't use
these constructs!

Note that in a typical pipeline with one get-based element and two chain-based
elements (decoder/sink) there is only one thread. No data is crossing thread
boundaries and thus this pipeline can be low-latency. Also note that while this
pipeline is streaming no interaction or locking is done between it and the main

  +-------------------------------------+
  | +--------+   +---------+   +------+ |
  | |  src   |   | decoder |   | sink | |
  | |       src-sink      src-sink    | |
  | +--------+   +---------+   +------+ |
  +-------------------------------------+

B) howto make non-threaded pipelines

For low latency it is required that data passing does not cross any thread
borders. Here are some pointers for making sure this requirement is met:

- Never connect a loop or chain based element to a loop based element. This
  will create a new thread for the sink loop element.

- Do not use queues or any other decoupled element, as they implicitly
  create a thread boundary.

- At least one thread will be created for any source element (either in the
  connected loop-based element or in the source itself) unless the source
  elements are connected to the same loop based element.

- When designing sinks, make them non-blocking. Use the async clock callbacks
  to schedule media rendering in the same thread (if any) as the clock. Sinks
  that provide the clock can be made blocking.