    Rework forward 8x8 2D-DCT ssse3 implementation · 9a780fa7
    Jingning Han authored
    This commit reworks the SSSE3 implementation of the forward 8x8
    2D-DCT. It rotates the temporary xmm registers cyclically. This
    reduces the average cycle count from 158 to 154. For comparison,
    the SSE2 version takes 169 cycles.
    
    Change-Id: I1b79b9642aae0ed3fb3cefb5b70246e6de5d5caa