libavcodec/x86/dsputil_yasm.asm · b1159ad92818cd8f0885d252b0800f5960fe7241 · BC / public / external / ffmpeg

refactor and optimize scalarproduct · b1159ad9

Loren Merritt authored Dec 05, 2009

29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
9-123% faster ape decoding on G4.
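
For readers unfamiliar with the operation being tuned, here is a minimal scalar sketch in C of a fused scalar product plus multiply-add over 16-bit samples, the kind of inner loop an adaptive filter such as apply_filter spends its time in. The function name, signature, and the exact fusion of the two steps are assumptions made for illustration; they are not taken from this commit. A SIMD version would process several elements per iteration, and that is where loads straddling a cache line (the "cachesplit" case mentioned above) become costly.

```c
#include <stdint.h>

/*
 * Illustrative sketch only: the name and signature are hypothetical.
 * Returns sum(v1[i] * v2[i]) computed from the values of v1 before the
 * update, and updates v1 in place with v1[i] += mul * v3[i].
 */
int32_t scalarproduct_and_madd_sketch(int16_t *v1, const int16_t *v2,
                                      const int16_t *v3, int order, int mul)
{
    int32_t res = 0;
    for (int i = 0; i < order; i++) {
        res   += v1[i] * v2[i];   /* dot product uses the old v1[i] */
        v1[i] += mul * v3[i];     /* filter coefficient update, truncated to 16 bits */
    }
    return res;
}
```

Fusing the two passes means each element of v1 is read once, which matters when the loop is bound by memory access rather than arithmetic.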

Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
