1. 24 Sep, 2013 - 1 commit
  2. 11 Sep, 2013 - 1 commit
      New mode_info_context storage -- undo revert · ac6093d1
      Scott LaVarnway authored
      mode_info_context was stored as a grid of MODE_INFO structs.
      The grid now consists of pointers to MODE_INFO structs. The
      MODE_INFO structs are now stored as a stream (decoder only),
      which eliminates unnecessary copies and is a little more cache
      friendly.
      
      Change-Id: I031d376284c6eb98a38ad5595b797f048a6cfc0d
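The storage change described in this commit can be illustrated with a small Python model. This is only a sketch under stated assumptions: `ModeInfo`, `build_grid`, and the block tuple layout are hypothetical names invented for illustration, and the assumption that every grid cell covered by a block refers to the same stream entry is an interpretation of "eliminating unnecessary copies", not something the commit message spells out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ModeInfo:
    """Hypothetical stand-in for MODE_INFO; the real struct holds far more."""
    mode: int


def build_grid(rows: int, cols: int,
               blocks: List[Tuple[int, int, int, int, int]]):
    """Build a stream of ModeInfo entries plus a grid of references to them.

    blocks: (row, col, height, width, mode) tuples in decode order.
    """
    stream: List[ModeInfo] = []
    grid: List[List[Optional[ModeInfo]]] = [[None] * cols for _ in range(rows)]
    for row, col, height, width, mode in blocks:
        mi = ModeInfo(mode)
        stream.append(mi)  # one stream entry per decoded block
        for r in range(row, min(row + height, rows)):
            for c in range(col, min(col + width, cols)):
                grid[r][c] = mi  # store a reference, not a copy
    return stream, grid
```

In this model a 2x2 block occupies four grid cells but only one stream entry, so updating that entry is visible through every cell that refers to it.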
  3. 10 Sep, 2013 - 1 commit
      Remove redundant condition check in 32x32 quant · 5d93feb6
      Jingning Han authored
      The C implementation of 32x32 quantization performs the zbin check
      on all coefficients before the quant/dequant loop, so this commit
      removes the redundant zbin check inside the loop. This only affects
      the C version; the SSSE3 version does not separate the zbin check out.
      
      Change-Id: Ic197a7d61d0b25fcac3cc092987651378cb56e4e
  4. 09 Sep, 2013 - 1 commit
  5. 07 Sep, 2013 - 1 commit
      Fix overflow issue in 16x16 quantization SSSE3 · 09bc942b
      Jingning Han authored
      The 16x16 transform unit test suggested that the peak coefficient
      value can reach 32639. This could cause an overflow in the SSSE3
      implementation of 16x16 block quantization. This commit fixes the
      issue by replacing addition with saturated addition.
      
      Change-Id: I6d5bb7c5faad4a927be53292324bd2728690717e
  6. 06 Sep, 2013 - 1 commit
      New mode_info_context storage · dae17734
      Scott LaVarnway authored
      mode_info_context was stored as a grid of MODE_INFO structs.
      The grid now consists of a pointer to a MODE_INFO struct and
      an "in the image" flag. The MODE_INFO structs are now stored
      as a stream, which eliminates unnecessary copies and is a little
      more cache friendly.
      
      For the test clips used, the decoder performance improved
      by ~4.3% (1080p) and ~9.7% (720p).
      
      Patch Set 2: Re-encoded clips with latest. Now ~1.7% (1080p)
      and 5.9% (720p).
      
      Change-Id: I846f29e88610fce2523ca697a9a9ef2a182e9256
  7. 05 Sep, 2013 - 1 commit
      Use saturated addition in SSSE3 of 32x32 quant · 458c2833
      Jingning Han authored
      The 32x32 forward transform can reach a peak coefficient value
      close to 32700, while the rounding factor can go up to 610.
      This could cause an overflow in the SSSE3 implementation of the
      32x32 quantization process.
      
      This commit resolves this issue by replacing the addition operations
      with saturated addition operations in 32x32 block quantization.
      
      Change-Id: Id6b98996458e16c5b6241338ca113c332bef6e70
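The overflow described in this commit can be reproduced with the figures from the message (a coefficient near 32700 plus a rounding factor of 610). The Python sketch below models signed 16-bit lanes; the function names are hypothetical and it does not reproduce the actual SSSE3 code. On x86, wraparound and saturating 16-bit adds correspond to the `paddw` and `paddsw` instructions respectively; that the commit uses exactly these instructions is an assumption.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def wrap_add_int16(a: int, b: int) -> int:
    """Add with two's-complement wraparound in a signed 16-bit lane."""
    s = (a + b) & 0xFFFF
    return s - 0x10000 if s >= 0x8000 else s


def sat_add_int16(a: int, b: int) -> int:
    """Add and clamp the result to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


# 32700 + 610 = 33310, which does not fit in a signed 16-bit value.
print(wrap_add_int16(32700, 610))  # -32226: the sign flips
print(sat_add_int16(32700, 610))   # 32767: clamped to the maximum
```

With wraparound the sum turns into a large negative number, which would corrupt the quantized value; with saturation it is merely clamped to the largest representable magnitude.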
  8. 29 Aug, 2013 - 1 commit
      Fix overflow issue in SSSE3 32x32 quantization · abff6788
      Jingning Han authored
      In the 32x32 quantization process, intermediate values can exceed
      the 16-bit range, causing an enc/dec mismatch. This commit fixes
      the overflow in the SSSE3 implementation, as well as the
      prototype, of 32x32 quantization.
      
      This fixes issue 607 from webm@googlecode.
      
      Change-Id: I85635e6ca236b90c3dcfc40d449215c7b9caa806
  9. 19 Aug, 2013 - 1 commit
  10. 15 Aug, 2013 - 1 commit
  11. 12 Aug, 2013 - 1 commit
  12. 09 Aug, 2013 - 1 commit
  13. 16 Jul, 2013 - 1 commit
      Inline vp9_quantize() in xform_quant(). · 1ff94fea
      Ronald S. Bultje authored
      Cycle times:
      4x4:    151 to  131 cycles (15% faster)
      8x8:    334 to  306 cycles (9% faster)
      16x16: 1401 to 1368 cycles (2.5% faster)
      32x32: 7403 to 7367 cycles (0.5% faster)
      
      Total encode time of first 50 frames of bus @ 1500kbps (speed 0)
      goes from 1min39.2 to 1min38.6, i.e. a 0.67% overall speedup.
      
      Change-Id: I799a49460e5e3fcab01725564dd49c629bfe935f
  14. 11 Jul, 2013 - 1 commit
      Moving segmentation related vars into separate struct. · c4ad3273
      Dmitry Kovalev authored
      Adding segmentation struct to vp9_seg_common.h. Struct members are from
      macroblockd and VP9Common structs. Moving segmentation related constants
      and enums to vp9_seg_common.h.
      
      Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
  15. 01 Jul, 2013 - 2 commits
      Update quantize SSSE3 SIMD to cover 32x32 transform case also. · c8defcfd
      Ronald S. Bultje authored
      Encode time of bus (speed 0) 50 frames @ 1500kbps goes from 2min14.4 to
      2min10.1, i.e. a 2.3% overall speed increase.
      
      Change-Id: I3699580e74ec26c7d24e03681bc47ba25ee1ee87
      Quantize (64-bit only, for now) SSSE3 SIMD. · 7353ceab
      Ronald S. Bultje authored
      Total encoding time for the first 50 frames of bus (speed 0) @ 1500kbps
      goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code
      is x86-64 only; it needs some minor modifications to be 32bit compatible,
      because it uses 15 xmm registers, whereas 32bit only has 8.
      
      Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
  16. 28 Jun, 2013 - 1 commit
      Make coefficient skip condition an explicit RD choice. · af660715
      Ronald S. Bultje authored
      This commit replaces zrun_zbin_boost, a method of biasing non-zero
      coefficients following runs of zero-coefficients to be rounded towards
      zero, with an explicit skip-block choice in the RD loop.
      
      The logic is basically that if individual coefficients should be rounded
      towards zero (from a RD point of view), the trellis/optimize loop should
      take care of it. If whole blocks should be zero (from a RD point of
      view), a single RD check is much more efficient than a complete
      serialization of the quantization loop.
      
      Quality change: derf +0.5% psnr, +1.6% ssim; yt +0.6% psnr, +1.1% ssim.
      SIMD for quantize will follow in a separate patch. Results for other
      test sets pending.
      
      Change-Id: Ife5fa641163ac5150ac428011e87188f1937c1f4
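The "single RD check" for a whole block mentioned in this commit can be pictured as one comparison of two rate-distortion costs. The sketch below is an illustration only: it uses a plain Lagrangian cost (lambda times rate plus distortion), and `rd_cost`, `choose_skip`, and their parameters are hypothetical names, not the encoder's actual cost macro or function signatures.

```python
def rd_cost(rate: float, distortion: float, lam: float) -> float:
    """Simplified Lagrangian rate-distortion cost (assumed form)."""
    return lam * rate + distortion


def choose_skip(rate_coded: float, dist_coded: float,
                rate_skip: float, dist_skip: float, lam: float) -> bool:
    """Return True when zeroing the whole block is cheaper than coding it."""
    return rd_cost(rate_skip, dist_skip, lam) < rd_cost(rate_coded, dist_coded, lam)
```

For example, with `lam = 1.0`, a coded block costing 100 bits at distortion 50 (cost 150) loses to a skipped block costing 1 bit at distortion 120 (cost 121), so the block would be skipped.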
  17. 27 Jun, 2013 - 1 commit
  18. 19 Jun, 2013 - 1 commit
      Add two-pass quantization · b5bf7b13
      Yunqing Wang authored
      Optimized the quantization function by making it a two-pass
      process. The first pass performs a quick check of the transform
      coefficients against the base ZBIN and keeps only the subset of
      coefficients that need quantization. A skipping check is added:
      if all coefficients are within the base ZBIN, no quantization
      is needed. The second pass is the actual quantization pass,
      which only processes the coefficient subset determined in the
      first pass. This reduces the computation. Furthermore, an
      alternative method is used for large transform sizes, which often
      have sparse nonzero quantized coefficients.
      
      Overall, the encoder speedup is about 4%. The quantization function
      itself gets 20% faster.
      
      Change-Id: I3a9dd0da6db030260b6d9c314a9fa48ecae89f22
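The two-pass idea described in this commit can be sketched in Python as follows. This is a simplified model, not the encoder's code: the function name, the parameter names, the Q16 fixed-point multiply, and the `eob` (end-of-block) return value are assumptions made for illustration, and the coefficients are assumed to be in scan order already.

```python
from typing import List, Tuple


def two_pass_quantize(coeffs: List[int], zbin: int, round_: int,
                      quant: int, dequant: int) -> Tuple[List[int], List[int], int]:
    """Quantize only the coefficients that reach the zero-bin threshold."""
    n = len(coeffs)
    qcoeff = [0] * n
    dqcoeff = [0] * n

    # Pass 1: cheap threshold check; remember which indices need work.
    keep = [i for i, c in enumerate(coeffs) if abs(c) >= zbin]
    if not keep:
        return qcoeff, dqcoeff, 0  # whole block quantizes to zero, skip pass 2

    # Pass 2: quantize and dequantize only the surviving coefficients.
    eob = 0
    for i in keep:
        q = ((abs(coeffs[i]) + round_) * quant) >> 16  # Q16 fixed-point multiply
        if coeffs[i] < 0:
            q = -q
        qcoeff[i] = q
        dqcoeff[i] = q * dequant
        if q != 0:
            eob = i + 1  # position just past the last nonzero coefficient
    return qcoeff, dqcoeff, eob
```

With `zbin=50`, `round_=8`, `quant=8192` (a divide by 8 in Q16) and `dequant=8`, the input `[100, 3, -60, 0]` only sends indices 0 and 2 through the second pass, while an input such as `[1, -2, 3]` returns immediately after the first pass.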
  19. 23 May, 2013 - 1 commit
  20. 17 May, 2013 - 1 commit
      Initial version of alpha channel support · 679e4abd
      John Koleszar authored
      This is a mostly-working implementation of an extra channel in the
      bitstream. Configure with --enable-alpha to test. Notable TODOs:
      
       - Add extra channel to all mismatch tests, PSNR, SSIM, etc
       - Configurable subsampling
       - Variable number of planes (currently always uses all 4)
       - Loop filtering
       - Per-plane lossless quantizer
       - ARNR support
      
      This implementation just uses the same contents as the Y channel
      for the A channel, due to lack of content and general pain in
      playing back 4 channel content. A later patch will use the actual
      alpha channel passed in from outside the codec.
      
      Change-Id: Ibf81f023b1c570bd84b3064e9b4b8ae52e087592
  21. 07 May, 2013 - 3 commits
  22. 03 May, 2013 - 1 commit
      Separate transform and quant from vp9_encode_sb · 4529c68b
      John Koleszar authored
      This allows removing a large number of transform size specific functions,
      as well as supporting 444/alpha by routing all code through the
      subsampling-aware path.
      
      Change-Id: Ieb085cebe9f37f24fc24de179898b22abfda08a4
  23. 02 May, 2013 - 1 commit
      Create common vp9_encode_sb{,y} · 3f4e8063
      John Koleszar authored
      Creates a common encode (subtract, transform, quantize, optimize,
      inverse transform, reconstruct) function for all sb sizes, including
      the old 16x16 path.
      
      Change-Id: I964dff1ea7a0a5c378046a069ad83495f54df007
  24. 01 May, 2013 - 1 commit
  25. 30 Apr, 2013 - 2 commits
      sb8x8 integration in rd loop. · d068d869
      Ronald S. Bultje authored
      Work-in-progress, not yet ready for review. TODO items:
      - bitstream writing (encoder) and reading (decoder)
      - decoder reconstruction
      
      Change-Id: I5afb7284e7e0480847b47cd0097cb469433c9081
      Adding vp9_get_qindex function. · 3f6c6ffc
      Dmitry Kovalev authored
      Moving common code from encoder and decoder to vp9_get_qindex function.
      Also moving quant-related constants from vp9_onyxc_int.h to
      vp9_quant_common.h.
      
      Change-Id: I70c5bfbaa1c8bf00fde0bfc459d077f88b6d46c8
  26. 26 Apr, 2013 - 1 commit
  27. 25 Apr, 2013 - 3 commits
  28. 24 Apr, 2013 - 2 commits
  29. 23 Apr, 2013 - 1 commit
      Convert coeff to per-plane MACROBLOCK data · 138ec38c
      John Koleszar authored
      This commit moves the coeff storage from the MACROBLOCK struct to its
      per-plane part. The next commit will remove the coeff member from the
      BLOCK structure so that it is consistently accessed per-plane.
      
      Also refactors vp9_sb_block_error_c and vp9_sb_uv_block_error_c to be
      variable subsampling aware.
      
      Change-Id: I18c30f87f27c3a012119b6c1970d5fa499804455
  30. 22 Apr, 2013 - 2 commits
  31. 16 Apr, 2013 - 1 commit
  32. 15 Apr, 2013 - 1 commit