    AVX2 Convolve Optimization · 876c72a0
    levytamar82 authored
    Two convolve functions were optimized for AVX2:
    1. vp9_filter_block1d16_h8
    2. vp9_filter_block1d16_v8
    vp9_filter_block1d16_h8 was optimized for AVX2 by halving the number of
    loop iterations: two strides (rows) are processed in parallel per iteration.
    vp9_filter_block1d16_v8 was optimized in the same way. In addition, some of
    the loads were moved outside of the loop, which avoids redundant loads.
    This optimization gives a 43% function-level gain and a 1.3% user-level gain.
    The code can now also be compiled on Windows.
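
The loop restructuring described above can be illustrated with a scalar C sketch. It is not the AVX2 code from this change: it uses no intrinsics, and the function names, the tap layout (taps applied to src[x..x+7]) and the rounding (add 64, shift right by 7) are assumptions made for the example. It only shows the two ideas named in the message: the filter taps are loaded once before the loop, and each iteration produces two output rows, so the loop runs half as many times.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a filtered value to the 8-bit pixel range. */
static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Apply 8 taps to the pixels s[0..7]; rounding constants are assumed. */
static uint8_t apply_taps(const uint8_t *s, const int16_t *taps) {
  int sum = 64; /* rounding offset for the >> 7 below */
  for (int k = 0; k < 8; ++k) sum += s[k] * taps[k];
  return clip_pixel(sum >> 7);
}

/* Baseline (hypothetical): one 16-pixel row per loop iteration. */
static void filter_h8_ref(const uint8_t *src, ptrdiff_t src_stride,
                          uint8_t *dst, ptrdiff_t dst_stride,
                          int rows, const int16_t *filter) {
  for (int r = 0; r < rows; ++r) {
    for (int x = 0; x < 16; ++x) dst[x] = apply_taps(src + x, filter);
    src += src_stride;
    dst += dst_stride;
  }
}

/* Restructured (hypothetical): taps hoisted, two rows per iteration. */
static void filter_h8_two_rows(const uint8_t *src, ptrdiff_t src_stride,
                               uint8_t *dst, ptrdiff_t dst_stride,
                               int rows, const int16_t *filter) {
  int16_t taps[8];
  for (int k = 0; k < 8; ++k) taps[k] = filter[k]; /* loaded once */

  int r = 0;
  for (; r + 1 < rows; r += 2) {
    const uint8_t *s0 = src;
    const uint8_t *s1 = src + src_stride;
    uint8_t *d0 = dst;
    uint8_t *d1 = dst + dst_stride;
    for (int x = 0; x < 16; ++x) {
      /* Two independent rows; a SIMD version can keep both in one register. */
      d0[x] = apply_taps(s0 + x, taps);
      d1[x] = apply_taps(s1 + x, taps);
    }
    src += 2 * src_stride;
    dst += 2 * dst_stride;
  }
  if (r < rows) { /* leftover row when the row count is odd */
    for (int x = 0; x < 16; ++x) dst[x] = apply_taps(src + x, taps);
  }
}
```

In a 256-bit implementation the two rows would typically occupy the two 128-bit halves of one register, which is what makes pairing rows attractive for a 16-pixel-wide block.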
    
    Change-Id: I2714124cfb0c14a77d7a0ce126a20db92ffbf92c