From 38d174b37573cf8c2abfac19e1af67f9022d617c Mon Sep 17 00:00:00 2001
From: Michael Niedermayer
Date: Sun, 14 Sep 2008 02:38:47 +0000
Subject: [PATCH] The official guide to swscale for confused developers.

Originally committed as revision 15316 to svn://svn.ffmpeg.org/ffmpeg/trunk
---
 doc/swscale.txt | 98 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 98 insertions(+)
 create mode 100644 doc/swscale.txt

diff --git a/doc/swscale.txt b/doc/swscale.txt
new file mode 100644
index 0000000..b078f27
--- /dev/null
+++ b/doc/swscale.txt
@@ -0,0 +1,98 @@
+    The official guide to swscale for confused developers.
+   ========================================================
+
+Current (simplified) Architecture:
+---------------------------------
+                        Input
+                          v
+                   _______OR_________
+                 /                   \
+               /                       \
+       special converter     [Input to YUV converter]
+              |                         |
+              |          (8bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:0:0 )
+              |                         |
+              |                         v
+              |                  Horizontal scaler
+              |                         |
+              |      (15bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:1:1 / 4:0:0 )
+              |                         |
+              |                         v
+              |      Vertical scaler and output converter
+              |                         |
+              v                         v
+                         output
+
+
+Swscale has 2 scaler paths. Each side must be capable of handling
+slices, that is, consecutive non-overlapping rectangles of dimension
+(0,slice_top) - (picture_width, slice_bottom).
+
+special converter
+    These generally are unscaled converters of common
+    formats, like YUV 4:2:0/4:2:2 -> RGB15/16/24/32, though it could also
+    in principle contain scalers optimized for specific common cases.
+
+Main path
+    The main path is used when no special converter can be used. The code
+    is designed as a destination line pull architecture. That is, for each
+    output line the vertical scaler pulls lines from a ring buffer; when a
+    line is unavailable, the ring buffer pulls it from the horizontal scaler
+    and input converter of the current slice.
+    When no more output can be generated because lines from the next slice
+    would be needed, all remaining lines in the current slice are converted,
+    horizontally scaled and put in the ring buffer.
+    [This is done for luma and chroma, each with possibly different numbers
+     of lines per picture.]
+
+Input to YUV Converter
+    When the input to the main path is not planar 8bit per component YUV or
+    8bit gray, it is converted to planar 8bit YUV. 2 sets of converters
+    exist for this currently, one performing horizontal downscaling by 2
+    before the conversion and the other leaving the full chroma resolution
+    but being slightly slower. The scaler will try to preserve full chroma
+    here when the output uses it. It is possible to force full chroma with
+    SWS_FULL_CHR_H_INP, though, even for cases where the scaler thinks it is
+    useless.
+
+Horizontal scaler
+    There are several horizontal scalers. A special case worth mentioning is
+    the fast bilinear scaler that is made of runtime-generated MMX2 code
+    using specially tuned pshufw instructions.
+    The remaining scalers are specially tuned for various filter lengths.
+    They scale 8bit unsigned planar data to 16bit signed planar data.
+    Future >8bit per component inputs will need to add a new scaler here
+    that preserves the input precision.
+
+Vertical scaler and output converter
+    There is a large number of combined vertical scalers + output converters.
+    Some are:
+    * unscaled output converters
+    * unscaled output converters that average 2 chroma lines
+    * bilinear converters                (C, MMX and accurate MMX)
+    * arbitrary filter length converters (C, MMX and accurate MMX)
+    And
+    * Plain C  8bit 4:2:2 YUV -> RGB converters using LUTs
+    * Plain C 17bit 4:4:4 YUV -> RGB converters using multiplies
+    * MMX     11bit 4:2:2 YUV -> RGB converters
+    * Plain C 16bit Y -> 16bit gray
+      ...
+
+    RGB with less than 8bit per component uses dither to improve the
+    subjective quality and low frequency accuracy.
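As an illustration of the bit depths in the diagram (8bit in, 15bit between
the scalers, 8bit out), here is a minimal C sketch of one horizontal and one
vertical filter pass. It is not the swscale code. The function names
hscale_sketch() and vscale_sketch(), the H_ONE/V_ONE macros and the parameter
layout are hypothetical, the 1.0 points (1<<14 horizontal, 1<<12 vertical) are
taken from the "Filter coefficients" section, and the shift amounts and the
rounding constant are assumptions derived from those bit depths. Coefficients
are assumed non-negative to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

#define H_ONE (1 << 14) /* horizontal coefficient representing 1.0 */
#define V_ONE (1 << 12) /* vertical coefficient representing 1.0 */

/*
 * Horizontal pass (sketch): 8bit unsigned samples -> 15bit values stored
 * in int16_t. Output pixel i reads filter_size input samples starting at
 * filter_pos[i] and weights them with filter[i * filter_size + j].
 */
static void hscale_sketch(int16_t *dst, int dst_w, const uint8_t *src,
                          const int16_t *filter, const int *filter_pos,
                          int filter_size)
{
    assert(filter_size > 0);
    for (int i = 0; i < dst_w; i++) {
        int32_t sum = 0;
        for (int j = 0; j < filter_size; j++)
            sum += src[filter_pos[i] + j] * filter[i * filter_size + j];
        /* 8 bits * 14bit coefficients = 22 bits; drop 7 to keep 15. */
        sum >>= 7;
        if (sum > INT16_MAX)
            sum = INT16_MAX;
        dst[i] = (int16_t)sum;
    }
}

/*
 * Vertical pass (sketch): filter_size lines of 15bit data -> one 8bit
 * output line. lines[j] is the j-th input line covered by the filter.
 */
static void vscale_sketch(uint8_t *dst, int w, const int16_t *const *lines,
                          const int16_t *filter, int filter_size)
{
    assert(filter_size > 0);
    for (int i = 0; i < w; i++) {
        int32_t sum = 1 << 18; /* assumed rounding term: half of 1 << 19 */
        for (int j = 0; j < filter_size; j++)
            sum += lines[j][i] * filter[j];
        /* 15 bits * 12bit coefficients = 27 bits; drop 19 to get 8. */
        sum >>= 19;
        dst[i] = (uint8_t)(sum > 255 ? 255 : sum);
    }
}
```

With these 1.0 points a full-scale 8bit sample weighted by H_ONE stays below
the int16_t limit after the horizontal pass, which leaves a little headroom
for filters whose coefficients sum to slightly more than 1.0.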
+
+
+Filter coefficients:
+--------------------
+There are several different scalers (bilinear, bicubic, lanczos, area, sinc, ...).
+Their coefficients are calculated in initFilter().
+Horizontal filter coefficients have a 1.0 point at 1<<14, vertical ones at 1<<12.
+The 1.0 points have been chosen to maximize precision while leaving a
+little headroom for convolutional filters like sharpening filters and
+minimizing SIMD instructions needed to apply them.
+It would be trivial to use a different 1.0 point if some specific scaler
+would benefit from it.
+Also, as already hinted at, initFilter() accepts an optional convolutional
+filter as input that can be used for contrast, saturation, blur, sharpening,
+shift, chroma vs. luma shift, ...
+
-- 
2.7.4