1. 03 Nov, 2017 1 commit
  2. 26 Oct, 2017 1 commit
  3. 24 Oct, 2017 1 commit
    • Optimize convolve8 SSSE3 and AVX2 intrinsics · ae35425a
      Kyle Siefring authored
      Changed the intrinsics to perform summation in a way similar to the assembly.
      
      The new code diverges from the assembly by preferring unsaturated additions.
      
      Results for Haswell
      
      SSSE3
      Horiz/Vert  Size  Speedup
      Horiz       x4    ~32%
      Horiz       x8    ~6%
      Vert        x8    ~4%
      
      AVX2
      Horiz/Vert  Size  Speedup
      Horiz       x16   ~16%
      Vert        x16   ~14%
      
      BUG=webm:1471
      
      Change-Id: I7ad98ea688c904b1ba324adf8eb977873c8b8668
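The difference between saturated and unsaturated additions mentioned in the commit above can be illustrated with a scalar sketch. This is not the committed code: `add_sat16` and `add_wrap16` are hypothetical helper names, standing in for what one 16-bit lane does under a saturating add (such as `_mm_adds_epi16`) and a wrapping add (such as `_mm_add_epi16`).

```c
#include <stdint.h>

/* Saturating signed 16-bit add: the result is clamped to the int16_t range.
 * Scalar stand-in for one lane of a saturating SIMD add. */
static int16_t add_sat16(int16_t a, int16_t b) {
  int32_t s = (int32_t)a + (int32_t)b;
  if (s > INT16_MAX) return INT16_MAX;
  if (s < INT16_MIN) return INT16_MIN;
  return (int16_t)s;
}

/* Unsaturated (wrapping) signed 16-bit add: the result wraps modulo 2^16.
 * The final cast assumes two's complement conversion, which holds on
 * mainstream compilers. */
static int16_t add_wrap16(int16_t a, int16_t b) {
  return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}
```

Both helpers agree whenever the true sum fits in 16 bits, so an unsaturated add is only safe where the intermediate sums are known not to overflow.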
  4. 23 Oct, 2017 1 commit
  5. 20 Oct, 2017 1 commit
  6. 19 Oct, 2017 1 commit
  7. 17 Oct, 2017 1 commit
  8. 16 Oct, 2017 1 commit
    • Add 4 to 3 scaling SSSE3 optimization · 580d3224
      Linfeng Zhang authored
      Note that this change triggers a different C version on SSSE3 and
      generates different scaled output.
      
      It is 2x as fast as the version that calls vpx_scaled_2d_ssse3().
      
      Change-Id: I17fff122cd0a5ac8aa451d84daa606582da8e194
  9. 10 Oct, 2017 1 commit
  10. 09 Oct, 2017 1 commit
  11. 08 Oct, 2017 1 commit
    • Add AVX2 version of vpx_convolve8_avg. · 9ca06bcd
      Kyle Siefring authored
      vpx_convolve8_avg works by first running a normal horizontal filter, then a
      vertical filter that averages at the end.
      
      The added vpx_convolve8_avg_avx2 calls pre-existing AVX2 code for the
      horizontal step.
      
      vpx_convolve8_avg_vert_avx2 is also added, but only uses SSSE3 code.
      
      Change-Id: If5160c0c8e778e10de61ee9bf42ee4be5975c983
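The two-pass structure described in the commit above (a horizontal filter, then a vertical filter that averages at the end) can be sketched in plain C. This is a scalar illustration, not the AVX2 code: `filter8`, `convolve8_avg_sketch`, the 64-wide temporary buffer, and the assumption that the result is averaged with the existing destination pixels with rounding are all illustrative choices made here.

```c
#include <stdint.h>

#define TAPS 8
#define FILTER_BITS 7 /* assumption: taps sum to 1 << 7 = 128 */
#define TMP_STRIDE 64 /* assumption: blocks are at most 64x64 */

static uint8_t clip_u8(int v) {
  return v < 0 ? 0 : (v > 255 ? 255 : (uint8_t)v);
}

/* Apply an 8-tap filter to 8 samples spaced `step` apart, then round and clip. */
static uint8_t filter8(const uint8_t *src, int step, const int16_t *taps) {
  int sum = 0;
  for (int k = 0; k < TAPS; ++k) sum += src[k * step] * taps[k];
  return clip_u8((sum + (1 << (FILTER_BITS - 1))) >> FILTER_BITS);
}

/* `src` points at the top-left pixel of the block. The caller must provide
 * 3 valid rows/columns before it and 4 after the block (w, h <= 64). */
static void convolve8_avg_sketch(const uint8_t *src, int src_stride,
                                 uint8_t *dst, int dst_stride,
                                 const int16_t *taps_x, const int16_t *taps_y,
                                 int w, int h) {
  uint8_t tmp[(64 + 7) * TMP_STRIDE];

  /* Pass 1: normal horizontal filter over h + 7 rows, starting 3 rows up. */
  for (int y = 0; y < h + 7; ++y)
    for (int x = 0; x < w; ++x)
      tmp[y * TMP_STRIDE + x] =
          filter8(src + (y - 3) * src_stride + (x - 3), 1, taps_x);

  /* Pass 2: vertical filter, averaged with the destination at the end. */
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x) {
      const uint8_t v = filter8(tmp + y * TMP_STRIDE + x, TMP_STRIDE, taps_y);
      uint8_t *d = &dst[y * dst_stride + x];
      *d = (uint8_t)((*d + v + 1) >> 1);
    }
}
```

With an identity kernel (`{0, 0, 0, 128, 0, 0, 0, 0}` for both passes) the function reduces to a rounded average of source and destination, which makes it easy to check by hand.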
  12. 04 Oct, 2017 1 commit
  13. 03 Oct, 2017 4 commits
  14. 29 Sep, 2017 1 commit
  15. 28 Sep, 2017 1 commit
  16. 27 Sep, 2017 2 commits
  17. 22 Sep, 2017 1 commit
  18. 20 Sep, 2017 3 commits
  19. 19 Sep, 2017 2 commits
  20. 12 Sep, 2017 1 commit
    • Revert "Revert "quantize avx: copy 32x32 implementation"" · eb4238ac
      Johann authored
      This reverts commit 8c42237b.
      
      Because ssse3 code is used for the reference, the qcoeff and dqcoeff
      reference buffers must be aligned.
      
      Original change's description:
      > quantize avx: copy 32x32 implementation
      >
      > Ensure avx and ssse3 stay in sync by testing them against each other.
      >
      > Change-Id: I699f3b48785c83260825402d7826231f475f697c
      
      Change-Id: Ieeef11b9406964194028b0d81d84bcb63296ae06
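The alignment requirement noted in the commit above comes from SSE aligned loads and stores, which need 16-byte aligned addresses. A minimal sketch of allocating such a buffer with C11 `aligned_alloc` follows; `alloc_coeff_buffer` and `is_aligned16` are hypothetical helpers for illustration, not the allocation code the test actually uses.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate `count` int16_t coefficients on a 16-byte boundary.
 * aligned_alloc expects the size to be a multiple of the alignment,
 * so the byte count is rounded up. Release the buffer with free(). */
static int16_t *alloc_coeff_buffer(size_t count) {
  size_t bytes = count * sizeof(int16_t);
  bytes = (bytes + 15) & ~(size_t)15;
  return (int16_t *)aligned_alloc(16, bytes);
}

/* Check that a pointer is suitable for 16-byte aligned SIMD access. */
static int is_aligned16(const void *p) {
  return ((uintptr_t)p & 15) == 0;
}
```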
  21. 11 Sep, 2017 1 commit
  22. 07 Sep, 2017 1 commit
  23. 05 Sep, 2017 2 commits
  24. 30 Aug, 2017 1 commit
  25. 29 Aug, 2017 2 commits
  26. 25 Aug, 2017 1 commit
    • Revert "quantize avx: copy 32x32 implementation" · 8c42237b
      Marco Paniconi authored
      This reverts commit f60d1dcd.
      
      Reason for revert: Failures in AVX/VP9QuantizeTest in nightly tests.
      
      Original change's description:
      > quantize avx: copy 32x32 implementation
      > 
      > Ensure avx and ssse3 stay in sync by testing them against each other.
      > 
      > Change-Id: I699f3b48785c83260825402d7826231f475f697c
      
      TBR=slavarnway@google.com,johannkoenig@google.com,builds@webmproject.org
      
      Change-Id: Ibd38636212269328317dd0721be9d25452113d1c
      No-Presubmit: true
      No-Tree-Checks: true
      No-Try: true
  27. 24 Aug, 2017 2 commits
  28. 23 Aug, 2017 1 commit
    • quantize avx: copy implementation to intrinsics · 7c278721
      Johann authored
      Adds an early exit based on ptest. Slightly slower than ssse3 in the
      full case because of the extra check, but potentially faster if lots of
      rows can be skipped.
      
      Very close in speed to the assembly.
      
      Can run in 32 bit, unlike the assembly. Allows reworking the function
      prototype to use structs.
      
      Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e
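The early exit described in the commit above can be sketched in scalar C. This is only an analogue: the real code tests a SIMD comparison mask with `ptest`, whereas here `group_is_skippable` ORs the per-coefficient comparisons together. The function names, the group size of 16, and the simplified quantization formula are assumptions for illustration, not the libvpx implementation.

```c
#include <stdint.h>
#include <stdlib.h>

#define GROUP 16 /* assumption: coefficients are processed 16 at a time */

/* Scalar stand-in for the ptest check: nonzero when no coefficient in the
 * group reaches the zbin threshold, so the whole group quantizes to zero. */
static int group_is_skippable(const int16_t *coeff, int16_t zbin) {
  int any = 0;
  for (int i = 0; i < GROUP; ++i) any |= (abs(coeff[i]) >= zbin);
  return !any;
}

/* Simplified quantizer with an early exit per group. `n` must be a multiple
 * of GROUP. Returns the number of groups that were skipped. */
static int quantize_sketch(const int16_t *coeff, int n, int16_t zbin,
                           int16_t round, int16_t quant, int16_t *qcoeff) {
  int skipped = 0;
  for (int g = 0; g < n; g += GROUP) {
    if (group_is_skippable(coeff + g, zbin)) {
      /* Early exit: write zeros and skip the arithmetic entirely. */
      for (int i = 0; i < GROUP; ++i) qcoeff[g + i] = 0;
      ++skipped;
      continue;
    }
    for (int i = 0; i < GROUP; ++i) {
      const int a = abs(coeff[g + i]);
      const int q = (a >= zbin) ? (((a + round) * quant) >> 16) : 0;
      qcoeff[g + i] = (int16_t)(coeff[g + i] < 0 ? -q : q);
    }
  }
  return skipped;
}
```

The trade-off matches the commit message: the check costs a little when every group has to be quantized, and pays off when many groups can be skipped.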
  29. 22 Aug, 2017 2 commits