1. 20 Sep, 2017 1 commit
  2. 13 Jun, 2017 2 commits
  3. 16 Feb, 2017 1 commit
    • Drop zbin_ptr and quant_shift_ptr · ca4e27f5
      Johann authored
      vp9[_highbd]_quantize_fp[_32x32] and vp9_fdct8x8_quant do not make use
      of these parameters.
      
      scan is used for C code and iscan is used for SIMD implementations.
      
      Change-Id: I908a0ff7d3febac33da97e0596e040ec7bc18ca5
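The scan/iscan remark in ca4e27f5 can be illustrated with a small sketch. It assumes a hypothetical 4x4 zig-zag table (not the tables shipped with the codec) and hypothetical helper names. The idea shown: scalar code can walk coefficients in scan order, while code that processes coefficients in raster order (as a vector load would) can recover the same end-of-block index from the inverse table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4x4 zig-zag order, for illustration only:
 * example_scan_4x4[i] is the raster position of the i-th coefficient visited. */
static const int16_t example_scan_4x4[16] = {
  0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15
};

/* Build the inverse table: iscan[pos] is the visiting order of raster
 * position pos, so iscan[scan[i]] == i for every i. */
static void build_iscan(const int16_t *scan, int16_t *iscan, int n) {
  for (int i = 0; i < n; ++i) {
    assert(scan[i] >= 0 && scan[i] < n);
    iscan[scan[i]] = (int16_t)i;
  }
}

/* Scalar style: visit coefficients in scan order and remember the position
 * just past the last nonzero one. */
static int eob_from_scan(const int16_t *qcoeff, const int16_t *scan, int n) {
  int eob = 0;
  for (int i = 0; i < n; ++i) {
    if (qcoeff[scan[i]] != 0) eob = i + 1;
  }
  return eob;
}

/* Raster style: visit coefficients in memory order and take the maximum of
 * iscan[pos] + 1 over the nonzero positions. */
static int eob_from_iscan(const int16_t *qcoeff, const int16_t *iscan, int n) {
  int eob = 0;
  for (int pos = 0; pos < n; ++pos) {
    if (qcoeff[pos] != 0 && iscan[pos] + 1 > eob) eob = iscan[pos] + 1;
  }
  return eob;
}
```

Both helpers return the same end-of-block value for any input block; the second one never needs a gather through the scan table, which is one plausible reason a SIMD path would prefer iscan.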
  4. 02 Aug, 2016 1 commit
  5. 27 May, 2016 1 commit
    • Upgrade fwht4x4_mmx() to fwht4x4_sse2() for vp9 and vp10. · af7fb17c
      Linfeng Zhang authored
      Function level timing test shows about 27% time saving on
      a Xeon E5-2680 v2 desktop.
      
      Rename vp9_dct_sse2.c to vp9_dct_intrin_sse2.c for vp9 and
      rename dct_sse2.c to dct_intrin_sse2.c for vp10 to avoid
      duplicate basenames.
      
      Actually vp9_fwht4x4_mmx/sse2() and vp10_fwht4x4_mmx/sse2()
      are identical. TODO: They should be unified later if there is
      no intention to keep a duplicate.
      
      Change-Id: I3e537b7bbd9ba417c606cd7c68c4dbbfa583f77d
  6. 28 Jul, 2015 3 commits
  7. 27 Jul, 2015 1 commit
  8. 26 Jul, 2015 1 commit
    • Refactor vp9_idct.h file · 5ebc8feb
      Jingning Han authored
      Separate the common coefficient constant into vpx_dsp/txfm_common.h.
      Move the SSE2 macro definitions to vpx_dsp/x86/txfm_common_sse2.h.
      This clears the use case of vp9_idct.h in vpx_dsp folder.
      
      Change-Id: I319735a2abf42888e5080ac14cfbcde34be7b121
  9. 24 Jul, 2015 1 commit
  10. 22 Jul, 2015 1 commit
  11. 20 Jul, 2015 1 commit
    • Unify the high bit-depth forward hybrid transforms · e253eaa0
      Jingning Han authored
      The SSE2 versions of the high bit-depth forward hybrid transforms
      essentially use the C functions by cross-referencing the 1-D
      functions in vp9_dct.c. This commit unifies the two versions and
      removes the unnecessary dependency.
      
      Change-Id: Ib4d0702a138f8daf7d0bd97c141ee7088f293765
  12. 16 May, 2015 1 commit
    • rename vp9_dct_impl_sse2.c to vp9_dct_sse2_impl.h · a989c66b
      James Zern authored
      this file shouldn't be built directly, it is included in vp9_dct_sse2.c
      to create a non-high-bitdepth and a high-bitdepth version
      
      silences missing prototype warnings for the unused FDCT* functions
      
      Change-Id: Ide6ff8c24ab31bdb0f833260505ae33660a1ad5b
  13. 15 May, 2015 3 commits
  14. 18 Mar, 2015 1 commit
  15. 25 Feb, 2015 1 commit
  16. 22 Dec, 2014 1 commit
  17. 19 Dec, 2014 1 commit
  18. 18 Dec, 2014 1 commit
  19. 04 Dec, 2014 1 commit
  20. 02 Dec, 2014 1 commit
  21. 19 Nov, 2014 1 commit
    • Combine fdct8x8 and quantization process · c6908fd5
      Jingning Han authored
      This commit reworks the forward transform and quantization process
      for 8x8 block coding. It combines the two operations in a single
      function to save a store/load stage of the original transform
      coefficients. Overall, speed -6 is slightly faster (around 1%).
      The compression performance of speed -6 improves by 3.4%.
      
      Change-Id: Id6628daef123f3e4649248735ec2ad7423629387
  22. 05 Nov, 2014 1 commit
  23. 03 Sep, 2014 1 commit
  24. 20 Aug, 2014 1 commit
  25. 12 Jun, 2014 1 commit
    • Fast computation path for forward transform and quantization · ccba289f
      Jingning Han authored
      This commit enables a fast path computational flow for forward
      transformation. It checks the sse and variance of prediction
      residuals and decides if the quantized coefficients are all
      zero, dc only, or more. It then selects the corresponding coding
      path in the forward transformation and quantization stage.
      
      It is currently enabled in rtc coding mode. Will do it for rd
      coding mode next.
      
      In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps
      goes down from 14234 ms to 13704 ms, i.e., about 4% speed-up.
      Overall coding performance for rtc set is changed by -0.18%.
      
      Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1
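The decision described in ccba289f (all zero, DC only, or more) can be sketched roughly as follows. This is a minimal illustration, not the encoder's code: the enum, the function name, and both thresholds are hypothetical, and the real criteria presumably depend on the quantizer step size. The sketch splits the residual energy into a DC part (from the sum) and an AC part (SSE minus the DC part, i.e. n times the variance).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the three coding paths named in the commit. */
typedef enum { PATH_ALL_ZERO, PATH_DC_ONLY, PATH_FULL } txfm_path;

/* Classify a block of prediction residuals.
 * ac_thresh and dc_thresh are assumed, caller-supplied energy thresholds. */
static txfm_path classify_residual(const int16_t *diff, int n,
                                   uint64_t ac_thresh, uint64_t dc_thresh) {
  int64_t sum = 0;
  uint64_t sse = 0;
  assert(n > 0);
  for (int i = 0; i < n; ++i) {
    sum += diff[i];
    sse += (uint64_t)((int32_t)diff[i] * diff[i]);
  }
  /* sum^2 / n is the energy carried by the mean (DC); the remainder of the
   * SSE is n times the variance, i.e. the AC energy. */
  const uint64_t dc_energy = (uint64_t)(sum * sum) / (uint64_t)n;
  const uint64_t ac_energy = sse - dc_energy;

  if (ac_energy >= ac_thresh) return PATH_FULL;   /* full transform + quantize */
  if (dc_energy >= dc_thresh) return PATH_DC_ONLY; /* only the DC term survives */
  return PATH_ALL_ZERO;                            /* skip transform entirely */
}
```

A caller would pick the cheap path when the classifier says the quantized block would be empty or DC only, and fall back to the regular transform and quantization otherwise.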
  26. 19 May, 2014 1 commit
    • Adjust the forward 16x16 DCT computation steps · 7f547336
      Jingning Han authored
      This commit adjusts the forward 16x16 DCT computation steps to
      simplify the register level operations. It fixes the corresponding
      sse2 version accordingly.
      
      Change-Id: I72a9c25b8ca9442fc5e113f47cd701ae55aa7f08
  27. 03 Mar, 2014 1 commit
    • improved speed of 4x4 sse2 fdct. · a46f5459
      Andrew Russell authored
      * speed improvement of 30 percent achieved
      * multiplies and adds remain the same
      * non-arithmetic instructions minimized by hand, by:
         -expanding 2 pass loop
         -removing irrelevant "shuffles"
         -combining last two rounding steps
      * further improvements may be possible
      
      Change-Id: Idec2c3f52910c48e6a0e0f9aefed5cae31b0b8c0
  28. 13 Feb, 2014 1 commit
  29. 06 Feb, 2014 1 commit
  30. 28 Jan, 2014 1 commit
  31. 21 Nov, 2013 1 commit
    • Improve vp9_fdct4x4_sse2 (x1.2) · ec2dbdd1
      Abo Talib Mahfoodh authored
      Modifications are done to reduce the total clock cycle count.
      Speedup: 1.2
      
      Tested with: park_joy_420_720p50.y4m
      
      Change-Id: Ia36b87e62e2f80a5fadaf5628729aedc80f38f3f
  32. 13 Nov, 2013 1 commit
    • Fix an overflow issue in SSE2 forward ADST · fabc7836
      Jingning Han authored
      The step that sums three input samples could potentially cause the
      intermediate result to go beyond the 16-bit limit when operating as
      the second 1-D transform. This commit fixes the issue.
      
      Change-Id: Iaf512449ac2d25ddd8a806d760afab362c62a516
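The overflow described in fabc7836 can be reproduced in scalar C. The sketch below is an illustration under assumptions, not the actual fix: the helper names are hypothetical, and the sample magnitudes are simply chosen large enough to resemble values that might reach a second 1-D pass. It contrasts a sum that wraps in a 16-bit lane with the same sum carried out in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* True when v is representable as a signed 16-bit value. */
static int fits_in_int16(int32_t v) {
  return v >= INT16_MIN && v <= INT16_MAX;
}

/* Sum three samples with 32-bit intermediates; this cannot overflow. */
static int32_t sum3_widened(int16_t a, int16_t b, int16_t c) {
  return (int32_t)a + (int32_t)b + (int32_t)c;
}

/* Emulate what a non-saturating 16-bit lane add produces: the result is
 * reduced modulo 2^16 and reinterpreted as two's complement. */
static int16_t sum3_wrapped(int16_t a, int16_t b, int16_t c) {
  const uint16_t u = (uint16_t)((uint16_t)a + (uint16_t)b + (uint16_t)c);
  return (int16_t)(u >= 0x8000u ? (int32_t)u - 0x10000 : (int32_t)u);
}

/* Debug-style guard: sum in 32 bits and assert the result still fits before
 * narrowing back to 16 bits. */
static int16_t sum3_checked(int16_t a, int16_t b, int16_t c) {
  const int32_t s = sum3_widened(a, b, c);
  assert(fits_in_int16(s));
  return (int16_t)s;
}
```

With three samples of 12000 each, the widened sum is 36000, which does not fit in 16 bits, and the wrapped lane result comes out negative. Widening before the add, or rescaling the inputs first, are two common ways to avoid this; which one the commit uses is not stated in the message.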
  33. 24 Oct, 2013 1 commit
  34. 23 Oct, 2013 2 commits