From 5f5f5e92e35c34d5515e9953f00fccfab97c4905 Mon Sep 17 00:00:00 2001
From: MyungJoo Ham
Date: Wed, 5 Sep 2018 18:21:51 +0900
Subject: [PATCH] [Documentation] Update README.md of converter

- Updated the status of converter.
- Added more information for converter.

Addressing #490

Signed-off-by: MyungJoo Ham
---
 gst/tensor_converter/README.md | 85 ++++++++++++++++++++----------------------
 1 file changed, 40 insertions(+), 45 deletions(-)

diff --git a/gst/tensor_converter/README.md b/gst/tensor_converter/README.md
index 8bb72c2..6b62ef1 100644
--- a/gst/tensor_converter/README.md
+++ b/gst/tensor_converter/README.md
@@ -2,60 +2,55 @@
 
 ## Supported features
 
-- Direct conversion of video/x-raw / non-interace(progressive) / RGB or BGRx stream to [height][width][RGB/BGRx] tensor.
-- Golden test for such inputs
+- Video: direct conversion of video/x-raw / non-interace(progressive) to [height][width][#Colorspace] tensor. (#Colorspace:width:height:1)
+  - Supported colorspaces: RGB (3), BGRx (4), Gray8 (1)
+  - You may express ```frames-per-tensor``` to have multiple image frames in a tensor like audio and text as well.
+  - If ```frames-per-tensor``` is not configured, the default value is 1.
+  - Golden tests for such input
+- Audio: direct conversion of audio/x-raw with arbitrary numbers of channels and frames per tensor to [frames-per-tensor][channels] tensor. (channels:frames-per-tensor:1:1)
+  - The number of frames per tensor is supposed to be configured manually by stream pipeline developer with the property of ```frames-per-tensor```.
+  - If ```frames-per-tensor``` is not configured, the default value is 1.
+- Text: direct conversion of text/x-raw with UTF-8 to [frames-per-tensor][1024] tensor. (1024:frames-per-tensor:1:1)
+  - The number of frames per tensor is supposed to be configured manually by stream pipeline developer with the property of ```frames-per-tensor```.
+  - If ```frames-per-tensor``` is not configured, the default value is 1.
+  - The size of a text frame, 1024, is assumed to be large enough for any single frame of strings. Because the dimension of tensor is the key metadata of a tensor stream pipeline, we need to fix the value before actually looking at the actual stream data.
+  - TODO (Schedule TBD): Allow to accept longer text frames without having larger default text frame size.
 
 ## Planned features
 
 From higher priority
 - Support other color spaces (IUV, BGGR, ...)
-- Support audio stream
-- Support text stream
 
-# Gstreamer standard tensor media type draft
+## Sink Pads
 
-- Proposed Name: other/tensor
-- Properties
-  - rank: int (1: vector, 2: matrix, 3: 3-tensor, 4: 4-tensor
-  - dim1: int (depth / color-RGB)
-  - dim2: int (width)
-  - dim3: int (height) / With version 0.0.1, this is with rstride-4.
-  - dim4: int (batch. 1 for image stream)
-  - type: string: int32, uint32, float32, float64, int16, uint16, int8, uint8
-  - framerate; fraction
+One "Always" sink pad exists. The capability of sink pad is ```video/x-raw```, ```audio/x-raw```, and ```text/x-raw```.
 
-  - data: (binary data, can be treated as an C array of [dim4][dim3][dim2][dim1])
+## Source Pads
 
-# other/tensors File Data Format, Version 1.0 Draft
+One "Always" source pad exists. The capability of source pad is ```other/tensor```. It does not support ```other/tensors``` because each frame (or a set of frames consisting a buffer) is supposed to be represented by a **single** tensor instance.
+
+For each outgoing frame (on the source pad), there always is a **single** instance of ```other/tensor```. Not less and not more.
+
+## Performance Characteristics
+
+- Video
+  - Unless it is RGB with ```width % 4 > 0``` or Gray8 with ```width % 4 > 0```, there is no memcpy or data modification processes. It only converts meta data in such cases.
+  - Otherwise, there will be one memcpy for each frame.
+- Audio
+  - TBD.
+- Text
+  - TBD.
+
+## Properties
+
+- frames-per-buffer: The number of incoming media frames that will be contained in a single instance of tensors. With the value > 1, you can put multiple frames in a single tensor.
+
+### Properties for debugging
+
+- silent: Enable/diable debugging messages.
+
+## Usage Examples
 
 ```
-Header-0: The first 20 bytes, global header
-=============================================================================================================
-| 8 bytes | 4 bytes | 4 bytes | 4 bytes |
-| The type header | Protocol Version | Number of tensors per frame | Frame size in bytes |
-| "TENSORST" | (uint32) 1 | (uint32) 1~16 (N) | (uint32) 1 ~ MAXUINT (S) |
-| | | (v.1 supports up to 16) | Counting data only (no meta) |
-=============================================================================================================
-| 20 bytes. RESERVED in v1. |
-=====================================================================================================================
-| Header-1: Following Header-1, Description of Tensor-1 of each frame (40 bytes) |
-| 4 bytes | 4 bytes | 4 bytes | 4 bytes | 4 bytes | 4 bytes | 16 bytes |
-| Element Type (enum) | RANK (uint32) | Dim-1 | Dim-2 | Dim-3 | Dim-4 | Name in string |
-| "tensor_type" in tensor_typedef.h | v.1 supports 1 to 4. | | | | | |
-=====================================================================================================================
-| ... |
-=====================================================================================================================
-| Header-N |
-=====================================================================================================================
-Data of frame-1, tensor-1 starts at the offset of (40 + 40 x N).
-Data of frame-1, tensor-i starts at the offset of (40 + 40 x N + Sum(x=1..i-1)(tensor_element_size[tensor-type of Tx] x dim1-of-Tx x dim2-of-Tx x dim3-of-Tx x dim4-of-Tx)).
-...
-Data of frame-F, tensor-1 starts at the offset of (40 + 40 x N + S x (F - 1))
-...
-Assert (S = Sum(x=1..N)(tensor_element_size[tensor-type of Tx] x dim1-of-Tx x dim2-of-Tx x dim3-of-Tx x dim4-of-Tx))
-
-Add a custom footer
+$ gst-launch videotestsrc ! video/x-raw,format=RGB,width=640,height=480 ! tensor_convert ! tensor_sink
 ```
-
-Note that once the stream is loaded in GStreamer, tensor\_\* elements uses the data parts only without the headers.
-The header exists only when the tensor stream is stored as a file.
-- 
2.7.4