    AVX2 SubPixel Variance Optimization · 52dac5d1
    levytamar82 authored
    Optimizing 2 functions to process 32 elements in parallel instead of 16:
    1. vp9_sub_pixel_variance64x64
    2. vp9_sub_pixel_variance32x32
    Both of those functions previously called vp9_sub_pixel_variance16xh_ssse3.
    They now call vp9_sub_pixel_variance32xh_avx2 instead, which is
    written in AVX2 and processes 32 elements in parallel.
    This optimization gave a 70% function-level gain and a 2% user-level gain.
    Change-Id: I4f5cb386b346ff6c878a094e1c3b37e418e50bde
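
The strip decomposition described in the message can be sketched in portable C. This is only an illustration, not the code from this commit: `strip_sum_sse` and `block_variance` are hypothetical names, the strip kernel is a scalar stand-in for the SSSE3/AVX2 kernels, and the sub-pixel (bilinear) filtering step is left out. It shows that a 64-wide block needs four kernel calls with 16-column strips but only two with 32-column strips, and that both give the same variance.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a strip kernel. It accumulates the sum of
 * differences and the sum of squared differences over a strip_w x h strip.
 * The real kernels also apply a bilinear sub-pixel filter, omitted here. */
static void strip_sum_sse(const uint8_t *src, int src_stride,
                          const uint8_t *ref, int ref_stride,
                          int strip_w, int h, int64_t *sum, uint64_t *sse) {
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < strip_w; ++x) {
      const int d = src[y * src_stride + x] - ref[y * ref_stride + x];
      *sum += d;
      *sse += (uint64_t)(d * d);
    }
  }
}

/* Variance of a w x h block, computed one vertical strip at a time.
 * *calls reports how many times the strip kernel was invoked. */
static unsigned int block_variance(const uint8_t *src, int src_stride,
                                   const uint8_t *ref, int ref_stride,
                                   int w, int h, int strip_w, int *calls) {
  int64_t sum = 0;
  uint64_t sse = 0;
  assert(w % strip_w == 0); /* the block must split into whole strips */
  *calls = 0;
  for (int x = 0; x < w; x += strip_w) {
    strip_sum_sse(src + x, src_stride, ref + x, ref_stride,
                  strip_w, h, &sum, &sse);
    ++*calls;
  }
  /* variance = SSE - sum^2 / N, with N = w * h pixels */
  return (unsigned int)(sse - (uint64_t)((sum * sum) / (w * h)));
}
```

Calling `block_variance` on a 64x64 block with `strip_w` set to 16 and then to 32 returns the same value while halving the number of kernel calls. Any speedup in the real code depends on the wider AVX2 registers, which this scalar sketch does not model.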
vp9_subpel_variance_impl_intrin_avx2.c 26.6 KB