-
Jian Zhou authored
Relocate the function from SSSE3 to SSE2, Unroll loop from 16 to 8, and reduce mem access to left. Speed up by single digit in ./test_intra_pred_speed on big core machines. Change-Id: I2b7fc95ffc0c42145be2baca4dc77116dff1c960
c90a8a1a