From 575e50fbbc96f217162fcf6cc0afe5f766724d85 Mon Sep 17 00:00:00 2001
From: Stefan Kost
Date: Wed, 9 Sep 2009 09:38:54 +0300
Subject: [PATCH] design: add ideas for buffer management

Right now we're operating suboptimally when talking to kernel interfaces.
Write down some ideas.
---
 docs/random/ensonic/draft-bufferpools.txt | 115 ++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)
 create mode 100644 docs/random/ensonic/draft-bufferpools.txt

diff --git a/docs/random/ensonic/draft-bufferpools.txt b/docs/random/ensonic/draft-bufferpools.txt
new file mode 100644
index 0000000..8bab08e
--- /dev/null
+++ b/docs/random/ensonic/draft-bufferpools.txt
@@ -0,0 +1,115 @@
+BufferPools
+-----------
+
+This document proposes a mechanism to build pools of reusable buffers. The
+proposal should improve performance and help to implement zero-copy use cases.
+
+Last edited: 2009-09-01 Stefan Kost
+
+
+Current Behaviour
+-----------------
+
+Elements either create their own buffers or request downstream buffers via
+pad_alloc. There is hardly any reuse of buffers; instead they are usually
+disposed of after being rendered.
+
+
+Problems
+--------
+
+  - hardware based elements like to reuse buffers as they e.g.
+    - mlock them (dsp)
+    - establish an index<->address relation (v4l2)
+  - not reusing buffers has overhead and makes runtime behaviour
+    non-deterministic:
+    - malloc (which usually becomes an mmap for bigger buffers and thus a
+      syscall) and free (which can trigger consolidation of freelists in the
+      allocator)
+    - shm alloc/attach, detach/free (xvideo)
+  - some use cases cause memcpys
+  - not having the right amount of buffers (e.g. too few buffers in v4l2src)
+  - receiving buffers of the wrong type (e.g. plain buffers in xvimagesink)
+  - receiving buffers with the wrong alignment (dsp)
+  - some use cases cause unneeded cache flushes when buffers are passed
+    between user and kernel space
+
+
+What is needed
+--------------
+
+Elements that sink raw data buffers of usually constant size would like to
+maintain a bufferpool. These could be sinks or encoders. We need mechanisms to
+select and dynamically update:
+
+  - the bufferpool owners in a pipeline
+  - the bufferpool sizes
+  - the queued buffer sizes, alignments and flags
+
+
+Proposal
+--------
+
+Querying the bufferpool size and buffer alignments can work similar to latency
+queries (gst/gstbin.c: {gst_bin_query, bin_query_latency_fold}). Aggregation
+is quite straightforward: the number of buffers is summed up and for the
+alignment we take the MAX value.
+
+Bins need to track which elements have been selected as bufferpool owners and
+update if those are removed (FIXME: in which states?).
+
+Bins would also need to track if elements that replied to the query are
+removed and update the bufferpool configuration (event). Likewise the addition
+of new elements needs to be handled (query and, if the configuration changed,
+update with an event).
+
+Bufferpool owners need to handle caps changes to keep the queued buffers valid
+for the negotiated format.
+
+The bufferpool could be a helper GObject (like we use GstAdapter). It would
+manage a collection of GstBuffers and track for each buffer whether it is in
+use or available. The bufferpool in gst-plugins-good/sys/v4l2/gstv4l2bufferpool
+might be a starting point.
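+
+As a rough illustration of such a helper object, a minimal pool could look
+like the sketch below. The names and structure are made up for this draft
+(this is not an existing API), and locking, caps changes, alignment and
+buffer flags are left out:
+
+  #include <gst/gst.h>
+
+  typedef struct {
+    GQueue *free_buffers;   /* buffers currently available for reuse */
+    guint   buffer_size;    /* size derived from the negotiated caps */
+  } BufferPool;
+
+  static GstBuffer *
+  buffer_pool_acquire (BufferPool *pool)
+  {
+    GstBuffer *buf = g_queue_pop_head (pool->free_buffers);
+
+    /* grow lazily; a real pool would stop at the aggregated buffer count */
+    if (buf == NULL)
+      buf = gst_buffer_new_and_alloc (pool->buffer_size);
+    return buf;
+  }
+
+  static void
+  buffer_pool_release (BufferPool *pool, GstBuffer *buf)
+  {
+    /* instead of unreffing, keep the buffer around for the next acquire */
+    g_queue_push_tail (pool->free_buffers, buf);
+  }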
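+
+The query aggregation mentioned at the beginning of this section could be a
+fold in the style of bin_query_latency_fold(). Again only a sketch; the
+fields n_buffers and alignment are assumptions of this draft:
+
+  #include <glib.h>
+
+  typedef struct {
+    guint n_buffers;   /* summed over all replies */
+    guint alignment;   /* maximum over all replies */
+  } BufferPoolFold;
+
+  static void
+  buffer_pool_fold_one (BufferPoolFold *fold, guint n_buffers, guint alignment)
+  {
+    fold->n_buffers += n_buffers;
+    fold->alignment = MAX (fold->alignment, alignment);
+  }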
+
+
+Scenarios
+---------
+
+v4l2src ! xvimagesink
+~~~~~~~~~~~~~~~~~~~~~
+- v4l2src would report 1 buffer (do we still want the queue-size property?)
+- xvimagesink would report 1 buffer
+
+v4l2src ! tee name=t ! queue ! xvimagesink t. ! queue ! enc ! mux ! filesink
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- v4l2src would report 1 buffer
+- xvimagesink would report 1 buffer
+- enc would report 1 buffer
+
+filesrc ! demux ! queue ! dec ! xvimagesink
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- dec would report 1 buffer
+- xvimagesink would report 1 buffer
+
+
+Issues
+------
+
+Does it make sense to also have pools for sources, or should they always use
+buffers from a downstream element?
+
+Do we need to add +1 to the aggregated buffer count so that one buffer can be
+in flight? E.g. can we push buffers quickly enough to have v4l2src !
+xvimagesink working with 2 buffers? What about v4l2src ! queue ! xvimagesink?
+
+More attributes on buffers are needed to reduce the overhead even further:
+
+  - padding: when using buffers on hardware one might need to pad the buffer
+    at the end to a specific alignment
+  - mlock: hardware that uses DMA needs the buffer memory locked; if a buffer
+    is already memory locked, it can be used as is by other hardware based
+    elements
+  - cache flushes: hardware based elements usually need to flush CPU caches
+    when sending results, as DMA based memory writes do not update values that
+    may already be cached on the CPU. If no element further down the pipeline
+    actually reads from this memory area, the flushes could be avoided. Other
+    hardware based elements and elements with ANY caps (tee, queue, capsfilter)
+    are examples of such elements.
-- 
2.7.4