  1. 04 Dec, 2015 1 commit
  2. 30 Nov, 2015 1 commit
    • SSE2 speed up of h_predictor_4x4 · 9d29d762
      Jian Zhou authored
      Relocate h_predictor_4x4 from SSSE3 to SSE2 with XMM registers.
      Speed up by ~25% in ./test_intra_pred_speed.
      
      Change-Id: I64e14c13b482a471449be3559bfb0da45cf88d9d
  3. 25 Nov, 2015 2 commits
  4. 23 Nov, 2015 1 commit
  5. 21 Nov, 2015 1 commit
  6. 19 Nov, 2015 2 commits
    • Speed up h_predictor_4x4 · d76032ae
      Jian Zhou authored
      Modify h_predictor_4x4 with XMM registers.
      Speed up by ~25% in ./test_intra_pred_speed.
      
      Change-Id: Id01c34c48e75b9d56dfc2e93af12cf0c0326a279
    • Speed up tm_predictor_4x4 · 79b68626
      Jian Zhou authored
      tm_predictor_4x4 is implemented with SSE2 using XMM registers.
      Speed up by ~25% in ./test_intra_pred_speed.
      
      Change-Id: I25074b78d476a2cb17f81cf654bdfd80df2070e0
  7. 18 Nov, 2015 1 commit
  8. 11 Nov, 2015 1 commit
  9. 10 Nov, 2015 2 commits
  10. 20 Oct, 2015 1 commit
    • Optimize vpx_quantize_{b,b_32x32} assembler. · 9cfba09a
      Geza Lore authored
      Added optimization of the 8 bit assembly quantizer routines. This makes
      these functions up to 100% faster, depending on encoding parameters.
      
      This patch makes the encoder faster in both the high bitdepth and 8bit
      configurations. In the high bitdepth configuration, it affects profile 0
      only.
      
      Based on my profiling using 1080p input the net gain is between 1-3% for
      the 8 bit config, and around 2.5-4.5% for the high bitdepth config,
      depending on target bitrate. The difference between the 8 bit and high
      bitdepth configurations for the same encoder run is reduced by 1% in all
      cases I have profiled.
      
      Change-Id: I86714a6b7364da20cd468cd784247009663a5140
  11. 14 Oct, 2015 1 commit
  12. 09 Oct, 2015 2 commits
  13. 06 Oct, 2015 1 commit
    • SSSE3 optimisation for quantize in high bit depth · 37c68efe
      Julia Robson authored
      When configured with high bit depth enabled, the 8bit quantize
      function stopped using optimised code. This made 8bit content
      decode slowly. This commit re-enables the SSSE3 optimisations.
      
      Change-Id: I194b505dd3f4c494e5c5e53e020f5d94534b16b5
  14. 05 Oct, 2015 2 commits
  15. 29 Sep, 2015 1 commit
    • Accelerated transform in high bit depth · 406030d1
      Julia Robson authored
      When configured with high bitdepth enabled, the 8bit transform
      stopped using optimised code. This made 8bit content decode slowly.
      
      Change-Id: I67d91f9b212921d5320f949fc0a0d3f32f90c0ea
  16. 18 Sep, 2015 1 commit
  17. 17 Sep, 2015 1 commit
    • vpx_subpixel_8t_ssse3: fix reg counts/access · 683b5a31
      James Zern authored
      fixes build on windows x64; previously 'heightq' i.e., the 64-bit register
      was accessed when only the 32-bit value was needed. given this is from a
      stack variable the upper bits were undefined.
      
      + bump register/xmm counts; users of SETUP_LOCAL_VARS touch xmm13 in
      64-bit builds and filter_block1d16_v* uses one extra temp variable
      
      Change-Id: I9c768c0b2047481d1d3b11c2e16b2f8de6eb0d80
  18. 04 Sep, 2015 1 commit
    • VPX: subpixel_8t_ssse3 asm using x86inc · 19588302
      Scott LaVarnway authored
      This is based on the original patch optimized for 32bit
      platforms by Tamar/Ilya and now uses the x86inc style asm.
      The assembly was also modified to support 64bit platforms.
      
      Change-Id: Ice12f249bbbc162a7427e3d23fbf0cbe4135aff2
  19. 27 Aug, 2015 1 commit
    • Add sse2 versions of halfpix variance · a28b2c6f
      Johann authored
      These were lost in the great sub pixel variance move of
      6a82f0d7
      
      Not having these functions caused a ~10% performance regression in
      some realtime vp8 encodes.
      
      Change-Id: I50658483d9198391806b27899f2c0d309233c4b5
  20. 26 Aug, 2015 1 commit
  21. 20 Aug, 2015 1 commit
  22. 19 Aug, 2015 1 commit
  23. 18 Aug, 2015 1 commit
  24. 10 Aug, 2015 1 commit
  25. 07 Aug, 2015 3 commits
  26. 05 Aug, 2015 3 commits
    • Narrow a load in iwht4x4_16_add. · 05720527
      Alex Converse authored
      The top half is unused.
      
      Change-Id: I29b2f6a93e20ea43aff4ad0bd2d52257e1e752b6
    • VPX: remove scaled calls from FUN_CONV_1D · 4e6b5079
      Scott LaVarnway authored
      and FUN_CONV_2D macros.  The predict lut now handles
      this case.  The encoder now calls vpx_scaled_2d() instead
      of vpx_convolve8() for scaling.
      
      Change-Id: Ia1c8af8a31e4cb4887a587143108cb45835f7df7
    • Revert "VP9_COPY_CONVOLVE_SSE2 optimization" · afd2f68d
      James Zern authored
      This reverts commit a5e97d87.
      
      Additionally:
      Revert "vpx_convolve_copy_sse2: fix win64"
      
      This reverts commit 22a8474f.
      
      This change performs poorly on various x86_64 devices affecting
      performance by 1-3% at 1080P. Performance on chromebook like devices was
      mixed neutral to slightly negative, so there should be minimal change
      there.
      
      Change-Id: I95831233b4b84ee96369baa192a2d4cc7639658c
  27. 04 Aug, 2015 2 commits
    • Change vp9_quantize to vpx_quantize · d621de7e
      Jingning Han authored
      This commit removes all remaining uses of the vp9_ prefix in vpx_dsp. It
      gets the vp9 folder ready to branch out vp10.
      
      Change-Id: I2906eec179ee792b4af8c9b4161313653050e931
    • Replace vp9_ prefix with vpx_ prefix in vpx_dsp function names · 08a453b9
      Jingning Han authored
      This commit cleans up the function naming convention in vpx_dsp. It
      replaces the vp9_ prefix of global functions with the vpx_ prefix. It
      also removes the vp9_ prefix from static functions.
      
      Change-Id: I6394359a63b71a51dda01342eec6a3cc08dfeedf
  28. 03 Aug, 2015 1 commit
  29. 02 Aug, 2015 1 commit
  30. 01 Aug, 2015 1 commit