## Supported features
-- Direct conversion of video/x-raw / non-interace(progressive) / RGB or BGRx stream to [height][width][RGB/BGRx] tensor.
-- Golden test for such inputs
+- Video: direct conversion of non-interlaced (progressive) video/x-raw to a [height][width][#Colorspace] tensor. (#Colorspace:width:height:1)
+ - Supported colorspaces: RGB (3), BGRx (4), Gray8 (1)
+ - You may set ```frames-per-tensor``` to put multiple image frames in a single tensor, as with audio and text.
+ - If ```frames-per-tensor``` is not configured, the default value is 1.
+ - Golden tests exist for such inputs.
+- Audio: direct conversion of audio/x-raw with an arbitrary number of channels and frames per tensor to a [frames-per-tensor][channels] tensor. (channels:frames-per-tensor:1:1)
+ - The number of frames per tensor must be configured manually by the stream pipeline developer through the ```frames-per-tensor``` property.
+ - If ```frames-per-tensor``` is not configured, the default value is 1.
+- Text: direct conversion of UTF-8 text/x-raw to a [frames-per-tensor][1024] tensor. (1024:frames-per-tensor:1:1)
+ - The number of frames per tensor must be configured manually by the stream pipeline developer through the ```frames-per-tensor``` property.
+ - If ```frames-per-tensor``` is not configured, the default value is 1.
+ - The size of a text frame, 1024, is assumed to be large enough for any single frame of strings. Because the tensor dimension is the key metadata of a tensor stream pipeline, the value has to be fixed before any stream data is seen.
+ - TODO (Schedule TBD): Accept longer text frames without enlarging the default text frame size.
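
The dimension notation used in the list above can be illustrated with a short Python sketch. It is not the element's implementation; the helper names (`video_dim`, `audio_dim`, `text_dim`, `dim_string`) are hypothetical, and placing `frames-per-tensor` in the fourth video dimension is an assumption, since the list only states the value 1 for that position.

```python
# Hypothetical helpers that mirror the dimension strings listed above.
# They are illustrative only and are not part of the element's API.

# Channels per pixel for the supported colorspaces.
VIDEO_CHANNELS = {"RGB": 3, "BGRx": 4, "GRAY8": 1}

# Fixed size of a single text frame, as stated above.
TEXT_FRAME_SIZE = 1024


def video_dim(fmt, width, height, frames_per_tensor=1):
    # Assumption: frames-per-tensor occupies the 4th dimension for video.
    return (VIDEO_CHANNELS[fmt], width, height, frames_per_tensor)


def audio_dim(channels, frames_per_tensor=1):
    return (channels, frames_per_tensor, 1, 1)


def text_dim(frames_per_tensor=1):
    return (TEXT_FRAME_SIZE, frames_per_tensor, 1, 1)


def dim_string(dim):
    # Render a dimension tuple in the colon-separated form used above.
    return ":".join(str(d) for d in dim)


if __name__ == "__main__":
    print(dim_string(video_dim("RGB", 640, 480)))
    print(dim_string(audio_dim(2, frames_per_tensor=1600)))
    print(dim_string(text_dim()))
```

The innermost dimension comes first in the colon-separated form, which is why a [height][width][#Colorspace] array is written as #Colorspace:width:height:1.
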
## Planned features
From higher priority
- Support other color spaces (YUV, BGGR, ...)
-- Support audio stream
-- Support text stream
-# Gstreamer standard tensor media type draft
+## Sink Pads
-- Proposed Name: other/tensor
-- Properties
- - rank: int (1: vector, 2: matrix, 3: 3-tensor, 4: 4-tensor
- - dim1: int (depth / color-RGB)
- - dim2: int (width)
- - dim3: int (height) / With version 0.0.1, this is with rstride-4.
- - dim4: int (batch. 1 for image stream)
- - type: string: int32, uint32, float32, float64, int16, uint16, int8, uint8
- - framerate; fraction
+One "Always" sink pad exists. Its capabilities are ```video/x-raw```, ```audio/x-raw```, and ```text/x-raw```.
- - data: (binary data, can be treated as an C array of [dim4][dim3][dim2][dim1])
+## Source Pads
-# other/tensors File Data Format, Version 1.0 Draft
+One "Always" source pad exists. Its capability is ```other/tensor```. It does not support ```other/tensors``` because each frame (or the set of frames constituting a buffer) is supposed to be represented by a **single** tensor instance.
+
+For each outgoing frame (on the source pad), there is always exactly **one** instance of ```other/tensor```, no fewer and no more.
+
+## Performance Characteristics
+
+- Video
+ - Unless the format is RGB or Gray8 with ```width % 4 > 0```, there is no memcpy or data modification; only the metadata is converted.
+ - Otherwise, there will be one memcpy for each frame.
+- Audio
+ - TBD.
+- Text
+ - TBD.
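
The video rule above can be sketched in Python under one assumption: the extra memcpy is needed because GStreamer pads each raw video row to a multiple of 4 bytes by default, so rows are only contiguous when the unpadded row size is already 4-byte aligned. The function names below are hypothetical and the sketch is not the element's code.

```python
# Bytes per pixel for the supported colorspaces.
BYTES_PER_PIXEL = {"RGB": 3, "BGRx": 4, "GRAY8": 1}


def round_up_4(n):
    # Round n up to the next multiple of 4.
    return (n + 3) & ~3


def row_stride(fmt, width):
    # Assumed row size in bytes, including padding to a 4-byte boundary.
    return round_up_4(width * BYTES_PER_PIXEL[fmt])


def needs_copy(fmt, width):
    # A copy is needed only when padding bytes must be stripped from rows.
    return row_stride(fmt, width) != width * BYTES_PER_PIXEL[fmt]


if __name__ == "__main__":
    for fmt, width in [("RGB", 640), ("RGB", 642), ("BGRx", 641), ("GRAY8", 641)]:
        print(fmt, width, "memcpy" if needs_copy(fmt, width) else "metadata only")
```

BGRx never needs the copy because 4 bytes per pixel always yields an aligned row, which matches the rule listing only RGB and Gray8.
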
+
+## Properties
+
+- frames-per-tensor: The number of incoming media frames contained in a single tensor instance. The default is 1; with a value > 1, multiple frames are put in a single tensor.
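
As a rough illustration of this property, the Python sketch below groups a sequence of incoming frames into tensors of a fixed frame count. The function `group_frames` is hypothetical, and the handling of leftover frames (they simply stay pending and are never emitted) is an assumption of the sketch, not documented behavior.

```python
def group_frames(frames, frames_per_tensor=1):
    """Yield one bytes object per tensor, each holding frames_per_tensor frames."""
    pending = []
    for frame in frames:
        pending.append(frame)
        if len(pending) == frames_per_tensor:
            # Enough frames collected: emit them as one contiguous tensor.
            yield b"".join(pending)
            pending = []
    # Assumption: an incomplete trailing group is not emitted.


if __name__ == "__main__":
    incoming = [b"f0", b"f1", b"f2", b"f3", b"f4"]
    for tensor in group_frames(incoming, frames_per_tensor=2):
        print(tensor)
```
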
+
+### Properties for debugging
+
+- silent: Enable/disable debugging messages.
+
+## Usage Examples
```
-Header-0: The first 20 bytes, global header
-=============================================================================================================
-| 8 bytes | 4 bytes | 4 bytes | 4 bytes |
-| The type header | Protocol Version | Number of tensors per frame | Frame size in bytes |
-| "TENSORST" | (uint32) 1 | (uint32) 1~16 (N) | (uint32) 1 ~ MAXUINT (S) |
-| | | (v.1 supports up to 16) | Counting data only (no meta) |
-=============================================================================================================
-| 20 bytes. RESERVED in v1. |
-=====================================================================================================================
-| Header-1: Following Header-1, Description of Tensor-1 of each frame (40 bytes) |
-| 4 bytes | 4 bytes | 4 bytes | 4 bytes | 4 bytes | 4 bytes | 16 bytes |
-| Element Type (enum) | RANK (uint32) | Dim-1 | Dim-2 | Dim-3 | Dim-4 | Name in string |
-| "tensor_type" in tensor_typedef.h | v.1 supports 1 to 4. | | | | | |
-=====================================================================================================================
-| ... |
-=====================================================================================================================
-| Header-N |
-=====================================================================================================================
-Data of frame-1, tensor-1 starts at the offset of (40 + 40 x N).
-Data of frame-1, tensor-i starts at the offset of (40 + 40 x N + Sum(x=1..i-1)(tensor_element_size[tensor-type of Tx] x dim1-of-Tx x dim2-of-Tx x dim3-of-Tx x dim4-of-Tx)).
-...
-Data of frame-F, tensor-1 starts at the offset of (40 + 40 x N + S x (F - 1))
-...
-Assert (S = Sum(x=1..N)(tensor_element_size[tensor-type of Tx] x dim1-of-Tx x dim2-of-Tx x dim3-of-Tx x dim4-of-Tx))
-
-Add a custom footer
+$ gst-launch-1.0 videotestsrc ! video/x-raw,format=RGB,width=640,height=480 ! tensor_convert ! tensor_sink
```
-
-Note that once the stream is loaded in GStreamer, tensor\_\* elements uses the data parts only without the headers.
-The header exists only when the tensor stream is stored as a file.