increased precision of s_xinc and s_xinc2 (needed for the mmx2 bugfix)
moved mmx variables to top to avoid alignment issues
mmx2 code should work fine now if and only if the input width is a multiple of 16 and the output width is a multiple of 32
reordered some code (5% faster with a simple -benchmark)
first line bug fixed (i hope i didn't introduce any new bugs with that ...)
changed a lot of the vertical scale setup code, i hope i fixed something and didn't mess it up :)
a few known bugs left (rightmost line is wrong)
MMX2 code will only be used for upscaling and for acceptable widths
16bit dithering can be disabled
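
for illustration only, the mmx2 eligibility rule described above (upscaling, input width a multiple of 16, output width a multiple of 32) could be sketched roughly like this. the function name and parameters are hypothetical and not taken from the actual code, and treating equal widths as "upscaling" is an assumption:

```c
#include <stdbool.h>

/* Hypothetical helper (not from the real source): returns true when the
 * MMX2 horizontal scaler would be eligible under the rules in this log. */
bool can_use_mmx2_scaler(int src_w, int dst_w)
{
    /* Assumption: a 1:1 scale is treated like upscaling. */
    bool upscaling = dst_w >= src_w;

    /* Input width must be a multiple of 16, output width a multiple of 32. */
    bool widths_ok = (src_w % 16 == 0) && (dst_w % 32 == 0);

    return upscaling && widths_ok;
}
```

with this sketch, 320 -> 640 would take the mmx2 path, while 640 -> 320 (downscaling) or 328 -> 640 (input width not a multiple of 16) would fall back to the other scaler code.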
Originally committed as revision 2265 to svn://svn.mplayerhq.hu/mplayer/trunk/postproc