1. 29 Nov, 2017 1 commit
    • vpx: [loongson] fix bug in var_filter_block2d_bil_16x · a0ca2a40
      Shiyou Yin authored
      The bug caused the following test cases to fail:
      1. MMI/VpxSubpelVarianceTest.Ref/6
      2. MMI/VpxSubpelVarianceTest.Ref/7
      3. MMI/VpxSubpelVarianceTest.ExtremeRef/6
      4. MMI/VpxSubpelVarianceTest.ExtremeRef/7
      
      Change-Id: I122ca20089e14ac324edd61295cf8f506e06afc8
  2. 27 Nov, 2017 1 commit
  3. 17 Nov, 2017 1 commit
  4. 15 Nov, 2017 1 commit
    • fwd txfm ssse3: use GLOBAL() for loading constants · 3e3a5686
      Johann authored
      Fixes a build issue when relocation is not allowed:
      relocation R_X86_64_32 against '.rodata' can not be used when making a shared object
      
      Change-Id: Ica3e90c926847bc384e818d7854f0030f4d69aa0
  5. 10 Nov, 2017 1 commit
    • vpx: [x86] add vpx_satd_avx2() · 8e602284
      Scott LaVarnway authored
      SSE2 intrinsic vs AVX2 intrinsic speed gains:
      blocksize   16: ~1.33
      blocksize   64: ~1.51
      blocksize  256: ~3.03
      blocksize 1024: ~3.71
      
      Change-Id: I79b28cba82d21f9dd765e79881aa16d24fd0cb58
  6. 09 Nov, 2017 1 commit
  7. 03 Nov, 2017 1 commit
  8. 26 Oct, 2017 1 commit
  9. 24 Oct, 2017 1 commit
    • Optimize convolve8 SSSE3 and AVX2 intrinsics · ae35425a
      Kyle Siefring authored
      Changed the intrinsics to perform summation similar to the way the assembly does.
      
      The new code diverges from the assembly by preferring unsaturated additions.
      
      Results for Haswell:
      
      SSSE3
      Horiz/Vert  Size  Speedup
      Horiz       x4    ~32%
      Horiz       x8    ~6%
      Vert        x8    ~4%
      
      AVX2
      Horiz/Vert  Size  Speedup
      Horiz       x16   ~16%
      Vert        x16   ~14%
      
      BUG=webm:1471
      
      Change-Id: I7ad98ea688c904b1ba324adf8eb977873c8b8668
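The commit message above mentions preferring unsaturated additions. As a scalar illustration only (this is not the libvpx code, and the helper names `sat_add16` and `wrap_add16` are hypothetical), the sketch below shows one property that distinguishes the two kinds of 16-bit addition: with wrapping adds, an intermediate overflow cancels out as long as the final true sum fits in 16 bits, whereas a saturating add clamps the intermediate value and loses information. Whether this is the reason behind the author's choice is an assumption.

```c
#include <stdint.h>

/* Saturating 16-bit add: clamps the result to [-32768, 32767],
 * mimicking what a saturating SIMD add does per lane. */
static int16_t sat_add16(int16_t a, int16_t b) {
  int s = (int)a + (int)b;
  if (s > INT16_MAX) return INT16_MAX;
  if (s < INT16_MIN) return INT16_MIN;
  return (int16_t)s;
}

/* Wrapping (unsaturated) 16-bit add: the sum is reduced modulo 2^16.
 * The arithmetic is done on unsigned values so the wrap is well defined;
 * the final conversion back to int16_t assumes two's complement. */
static int16_t wrap_add16(int16_t a, int16_t b) {
  uint16_t s = (uint16_t)((uint16_t)a + (uint16_t)b);
  return (int16_t)s;
}
```

For example, summing 30000, 10000 and -20000 (true result 20000) with `wrap_add16` overflows after the first add but still ends at 20000, while `sat_add16` clamps the first add to 32767 and ends at 12767.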
  10. 23 Oct, 2017 1 commit
  11. 20 Oct, 2017 1 commit
  12. 19 Oct, 2017 1 commit
  13. 17 Oct, 2017 1 commit
  14. 16 Oct, 2017 1 commit
    • Add 4 to 3 scaling SSSE3 optimization · 580d3224
      Linfeng Zhang authored
      Note that this change triggers a different C version on SSSE3 and
      generates different scaled output.
      
      It is 2x as fast as the version that calls vpx_scaled_2d_ssse3().
      
      Change-Id: I17fff122cd0a5ac8aa451d84daa606582da8e194
  15. 10 Oct, 2017 1 commit
  16. 09 Oct, 2017 1 commit
  17. 08 Oct, 2017 1 commit
    • Add AVX2 version of vpx_convolve8_avg. · 9ca06bcd
      Kyle Siefring authored
      vpx_convolve8_avg works by first running a normal horizontal filter, then a
      vertical filter that averages at the end.
      
      The added vpx_convolve8_avg_avx2 calls pre-existing AVX2 code for the
      horizontal step.
      
      vpx_convolve8_avg_vert_avx2 is also added, but only uses ssse3 code.
      
      Change-Id: If5160c0c8e778e10de61ee9bf42ee4be5975c983
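The two-pass structure described in the commit message above (a normal horizontal filter followed by a vertical filter that averages into the destination) can be sketched in scalar C as follows. This is a simplified illustration, not the libvpx implementation: the function names, the fixed 64x64 maximum block size, the 7-bit filter precision and the rounding in the averaging step are all assumptions made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define TAPS 8
#define FILTER_BITS 7 /* assumed: filter taps sum to 1 << FILTER_BITS */
#define MAX_DIM 64    /* assumed maximum block width and height */

static uint8_t clip_u8(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Horizontal 8-tap pass. Reads src[x - 3] .. src[x + 4] for each output. */
static void sketch_horiz(const uint8_t *src, ptrdiff_t src_stride,
                         uint8_t *dst, ptrdiff_t dst_stride,
                         const int16_t *filter, int w, int h) {
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      int sum = 0;
      for (int k = 0; k < TAPS; ++k)
        sum += src[y * src_stride + x + k - 3] * filter[k];
      dst[y * dst_stride + x] =
          clip_u8((sum + (1 << (FILTER_BITS - 1))) >> FILTER_BITS);
    }
  }
}

/* Vertical 8-tap pass that averages the filtered value with what is
 * already stored in dst (rounding up on ties). */
static void sketch_vert_avg(const uint8_t *src, ptrdiff_t src_stride,
                            uint8_t *dst, ptrdiff_t dst_stride,
                            const int16_t *filter, int w, int h) {
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      int sum = 0;
      for (int k = 0; k < TAPS; ++k)
        sum += src[(y + k - 3) * src_stride + x] * filter[k];
      int v = clip_u8((sum + (1 << (FILTER_BITS - 1))) >> FILTER_BITS);
      dst[y * dst_stride + x] = (uint8_t)((dst[y * dst_stride + x] + v + 1) >> 1);
    }
  }
}

/* Two-pass sketch: filter h + 7 rows horizontally into a temporary buffer
 * (3 rows above and 4 rows below the block), then run the averaging
 * vertical pass on that buffer. Requires w, h <= MAX_DIM. */
void sketch_convolve8_avg(const uint8_t *src, ptrdiff_t src_stride,
                          uint8_t *dst, ptrdiff_t dst_stride,
                          const int16_t *filter_x, const int16_t *filter_y,
                          int w, int h) {
  uint8_t temp[(MAX_DIM + 7) * MAX_DIM];
  sketch_horiz(src - 3 * src_stride, src_stride, temp, MAX_DIM, filter_x, w,
               h + 7);
  sketch_vert_avg(temp + 3 * MAX_DIM, MAX_DIM, dst, dst_stride, filter_y, w, h);
}
```

In this sketch only the second pass touches `dst`, which is why the horizontal step can reuse an ordinary (non-averaging) filter, as the commit message describes for the AVX2 horizontal code.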
  18. 04 Oct, 2017 1 commit
  19. 03 Oct, 2017 4 commits
  20. 30 Sep, 2017 1 commit
  21. 29 Sep, 2017 1 commit
  22. 28 Sep, 2017 1 commit
  23. 27 Sep, 2017 2 commits
  24. 26 Sep, 2017 1 commit
  25. 22 Sep, 2017 1 commit
  26. 20 Sep, 2017 3 commits
  27. 19 Sep, 2017 4 commits
  28. 18 Sep, 2017 1 commit
  29. 14 Sep, 2017 1 commit
    • mips msa clean-up msa macros · 4ca8f8f5
      Kaustubh Raste authored
      Removed inline for GP load-store in case of (__mips_isa_rev >= 6)
      Created one define LD_V for vector load and ST_V for vector store
      
      Change-Id: Ifec3570fa18346e39791b0dd622892e5c18bd448
  30. 13 Sep, 2017 1 commit
  31. 12 Sep, 2017 1 commit
    • Revert "Revert "quantize avx: copy 32x32 implementation"" · eb4238ac
      Johann authored
      This reverts commit 8c42237b.
      
      Because ssse3 code is used for the reference, the qcoeff and dqcoeff
      reference buffers must be aligned.
      
      Original change's description:
      > quantize avx: copy 32x32 implementation
      >
      > Ensure avx and ssse3 stay in sync by testing them against each other.
      >
      > Change-Id: I699f3b48785c83260825402d7826231f475f697c
      
      Change-Id: Ieeef11b9406964194028b0d81d84bcb63296ae06
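The commit message above notes that the qcoeff and dqcoeff reference buffers must be aligned because SSSE3 code produces the reference. A minimal sketch of allocating and checking such a buffer with C11 `aligned_alloc` is shown below. The helper names, the `int16_t` coefficient type and the 32-byte alignment are assumptions for illustration; the commit does not state the required alignment, and libvpx has its own allocation helpers that are not used here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns nonzero if p is aligned to `alignment` bytes. */
static int is_aligned(const void *p, size_t alignment) {
  return ((uintptr_t)p % alignment) == 0;
}

/* Allocates a coefficient buffer aligned to 32 bytes (assumed value).
 * aligned_alloc requires the size to be a multiple of the alignment,
 * so the byte count is rounded up. The caller releases it with free(). */
int16_t *alloc_coeff_buffer(size_t n_coeffs) {
  size_t bytes = n_coeffs * sizeof(int16_t);
  bytes = (bytes + 31) & ~(size_t)31;
  return (int16_t *)aligned_alloc(32, bytes);
}
```

A test harness could, for instance, allocate `alloc_coeff_buffer(32 * 32)` for each reference buffer and check `is_aligned(buf, 32)` before passing it to SIMD code that uses aligned loads and stores.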