1. 13 Nov, 2017 1 commit
    • New content type to improve grain retention. · a73cee28
      paulwilkins authored
      For the new VP9-only content type, adjust the rate distortion and ARF
      filter based on the relative spatial variance of the source and
      With regard to the RD loop, the method favors modes where the
      reconstruction variance is similar to the source variance. However, it
      is currently only applied to regions where the source variance is quite
      For very low variance blocks it applies a further bias against intra
      coding and large prediction block sizes (the latter in particular limit
      the usefulness of the loop filter).
      The final part of this change is to lower the strength of the ARF
      filter for blocks where the source has very low spatial variance, to
      encourage some low amplitude texture or noise to pass through
      the filter.
      This change improves the retention of film grain and fine noise /
      texture in spatially flat regions, but causes a significant
      drop in PSNR on many clips. This is expected because similar
      but misaligned noise or texture will give a lower PSNR than a flat
      noise free reconstruction. However, it is worth noting that most clips
      show a strong gain in FAST SSIM.
      The features are enabled on the vpxenc command line by setting
      VPX_ENCODER_ABI_VERSION bumped for this change and cvbr.
      Change-Id: I26a4e4edfa3dc5cacead82fa701fe7a9118ccd0a
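      As a rough illustration of the RD-loop bias described above, the sketch
      below inflates a mode's RD cost as its reconstruction variance diverges
      from the source variance. The function name biased_rd, the similarity
      formula, and the percentage scaling are all assumptions made for
      illustration; they are not taken from this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch, not the libvpx implementation.
 * Returns rd increased by a percentage that grows as the reconstruction
 * variance (rec_var) moves away from the source variance (src_var). */
static int64_t biased_rd(int64_t rd, unsigned int src_var,
                         unsigned int rec_var) {
  const uint64_t s = src_var;
  const uint64_t r = rec_var;
  uint64_t similarity; /* 100 when s == r, falling toward 0 as they diverge */

  /* Assumed bounds that keep the 64-bit arithmetic below from overflowing. */
  assert(rd >= 0 && rd <= INT64_MAX / 100);
  assert(s <= 0xFFFFFF && r <= 0xFFFFFF);

  if (s == 0 && r == 0) return rd; /* both flat: nothing to bias */

  /* 2sr <= s^2 + r^2, so similarity never exceeds 100. */
  similarity = (200 * s * r) / (s * s + r * r);
  return rd + (rd * (int64_t)(100 - similarity)) / 100;
}
```

      With matching variances the cost is unchanged; a flat reconstruction of a
      textured source receives the largest penalty in this sketch.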
  2. 08 Sep, 2017 1 commit
    • Fix bug in intra mode rd penalty. · 0657f473
      paulwilkins authored
      The intra mode rd penalty was implemented as a rate penalty.
      Code was added to scale the penalty according to block size but
      this was not done correctly for the SB level or sub 8x8.
      The code did a weird double scaling in regard to bit depth that
      has been removed. Given that it is a rate penalty the bit depth
      should not matter.
      This bug fix improves average metrics on our standard test
      sets by about 0.1%.
      Change-Id: I7cf81b66aad0cda389fe234f47beba01c7493b1e
  3. 05 Sep, 2017 1 commit
  4. 21 Aug, 2017 1 commit
    • Remove skip_block from quantize · 13eed991
      Johann authored
      This condition is handled before this code is reached. The ssse3 version
      of the function has always crashed when attempting to handle the
      skip_block condition.
      Add assert() and comments regarding the usage of skip_block.
      Removing the parameter is a fairly involved process so leave it be for
      the moment.
      Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a
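      The pattern described above (keep the parameter for now, but assert that
      it is never set) might look like the following sketch. quantize_sketch
      and its simple divide-by-step quantizer are hypothetical stand-ins, not
      the libvpx quantize functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical quantizer, not a libvpx function. skip_block stays in the
 * signature for compatibility, but callers must handle that case themselves. */
static int quantize_sketch(const int16_t *coeff, int n, int skip_block,
                           int16_t step, int16_t *qcoeff) {
  int eob = 0; /* index of the last nonzero quantized coefficient, plus one */
  int i;

  /* skip_block is expected to be handled before this function is called. */
  assert(!skip_block);
  (void)skip_block; /* avoids an unused-parameter warning when NDEBUG is set */
  assert(step > 0);

  for (i = 0; i < n; ++i) {
    qcoeff[i] = (int16_t)(coeff[i] / step); /* truncating division */
    if (qcoeff[i] != 0) eob = i + 1;
  }
  return eob;
}
```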
  5. 07 Jul, 2017 1 commit
  6. 29 Jun, 2017 1 commit
  7. 03 May, 2017 2 commits
  8. 25 Apr, 2017 2 commits
  9. 24 Apr, 2017 1 commit
    • Make the row based multi-threaded encoder deterministic · 10a497bd
      Yunqing Wang authored
      This patch followed the allow_exhaustive_searches feature modification
      and continued to modify the encoder to achieve determinism in row-based
      multi-threaded encoding. With row-mt = 1 and multiple threads, the
      adaptive feature in the encoder was disabled, which gave a BDRate gain
      (at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%) but some
      encoder speed loss (7% ~ 10% at speed 1 and 3% ~ 6% at speed 2). These
      speed losses were acceptable considering the speed gains obtained from
      row-mt.
      Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
  10. 19 Apr, 2017 1 commit
    • Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve · bf8a49ab
      Linfeng Zhang authored
      The rule is: if a short ptr is cast to a byte ptr, any offset
      operation on the byte ptr must be doubled. We do this by casting to
      short ptr first, adding offset, then casting back to byte ptr.
      Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248
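      A minimal sketch of the rule above, assuming the byte pointer was
      produced by a plain C cast from a uint16_t pointer. The helper name
      advance_by_shorts is hypothetical, and libvpx's own
      CONVERT_TO_BYTEPTR/CONVERT_TO_SHORTPTR macros may encode the pointer
      differently from a plain cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper using plain casts. p must have been obtained by casting
 * a uint16_t pointer, so it is suitably aligned for uint16_t. */
static uint8_t *advance_by_shorts(uint8_t *p, ptrdiff_t n) {
  uint16_t *s;

  assert(((uintptr_t)p % sizeof(uint16_t)) == 0);
  s = (uint16_t *)p;   /* cast back to the short pointer first */
  s += n;              /* offset in units of 16-bit samples */
  return (uint8_t *)s; /* then cast to a byte pointer again */
}
```

      Advancing by n samples this way moves the byte pointer by 2 * n bytes,
      which is the doubling the rule refers to.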
  11. 06 Apr, 2017 1 commit
    • VP9 motion vector unit test · 1aa46abb
      Yunqing Wang authored
      To guard against the motion vector out of range bug, added a motion
      vector unit test in VP9. When encoding 4k video, the test always forces
      extreme motion vectors and encourages INTER modes. When decoding, it
      checks that each motion vector is valid and also checks for
      encoder/decoder mismatch. The tests showed that this unit test could
      reveal the issue we saw before.
      Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4
  12. 22 Mar, 2017 1 commit
  13. 20 Mar, 2017 1 commit
    • Record the sum of tx block eobs in the partition block · 9c2552a1
      Yunqing Wang authored
      The sum of tx block eobs is needed in the machine learning based partition
      early termination. The eobs are first accumulated during tx search, and
      then the value associated with the best tx_size is copied to ctx for later
      Once the sum of eobs was calculated correctly, re-enabled the
      ml_partition_search_early_termination speed feature.
      Re-did the quality/speed test to check the impact of the fix.
      1. Borg test BDRATE result:
      4k set:     PSNR: +0.183%; SSIM: +0.100%;
      hdres set:  PSNR: +0.168%; SSIM: +0.256%;
      midres set: PSNR: +0.186%; SSIM: +0.326%;
      2. Average speed gain result:
      4k clips: 21%;
      hd clips: 26%;
      midres clips: 15%.
      The result is in line with the original result.
      Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
  14. 16 Mar, 2017 1 commit
    • Add a vector form of routine vp9_model_rd_from_var_lapndz · 976ddb61
      Gabriel Marin authored
      Add routine vp9_model_rd_from_var_lapndz_vec and call it from model_rd_for_sb
      to model the rate and distortion for MAX_MB_PLANE Laplacian sources in
      parallel. The caller ensures that all sources have non-zero variance.
      Measured an 18% to 25% reduction in retired instructions, and 17% to 24%
      reduction in instruction execution cost with different compilers for the
      Laplacian modeling.
      No change in behavior.
      TEST=Verified that encoded files match bit for bit, with and without this
      Change-Id: I6b76947f21c659a349adb896e13e99f6e3f951e6
  15. 03 Mar, 2017 1 commit
  16. 27 Feb, 2017 1 commit
    • vp9: Rename new_mt to row_mt · 58816014
      Vignesh Venkatasubramanian authored
      new_mt is a very generic name that will get obsolete soon enough.
      Since this is exposed as a codec control, renaming it to row_mt to
      signify row level parallelism. Also renaming the ETHREAD_BIT_MATCH
      codec control to ROW_MT_BIT_EXACT.
      Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
  17. 24 Feb, 2017 1 commit
    • consolidate block_error functions · 904b957a
      Johann authored
      vp9_highbd_block_error_8bit_c was a very simple wrapper around
      vp9_block_error_c. The SSE2 implementation was practically identical to
      the non-HBD one. It was missing some minor improvements which only
      went into the original version.
      In quick speed tests, the AVX implementation showed minimal
      improvement over SSE2 when it does not detect overflow. However, when
      overflow is detected the function is run a second time. The
      OperationCheck test seems to trigger this case and reverses any
      speed benefits by running ~60% slower. AVX2 on the other hand is
      always 30-40% faster.
      Change-Id: I9fcb9afbcb560f234c7ae1b13ddb69eca3988ba1
  18. 16 Feb, 2017 1 commit
  19. 15 Feb, 2017 1 commit
    • Row based multi-threading of encoding stage · 71061e93
      Ranjit Kumar Tulabandu authored
      (Yunqing Wang)
      This patch implements the row-based multi-threading within tiles in
      the encoding pass, and substantially speeds up the multi-threaded
      encoder in VP9.
      Speed tests at speed 1 on the STDHD set (using 4 tiles) show that the
      average speedup of the encoding pass (the second pass in 2-pass
      encoding) is 7% with 2 threads, 16% with 4 threads,
      85% with 8 threads, and 116% with 16 threads.
      Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
  20. 01 Feb, 2017 3 commits
    • Changes to facilitate row based multi-threading of ARNR filtering · 359a6796
      Ranjit Kumar Tulabandu authored
      Change-Id: I2fd72af00afbbeb903e4fe364611abcc148f2fbb
    • vp9_rdopt: declare 'c' closer to use · bfd62cda
      Johann authored
      Clears up static clang analysis warning regarding a dead store. Only
      declare 'c' when it will be used.
      Change-Id: I1ac0fc7f94bc44da63938c63cd1efcd6b95e0eb3
    • Fix real-time compression regression in hbd mode · 969957f9
      Jingning Han authored
      This commit resolves the compression performance regression in
      the real-time encoding setting when high bit-depth mode is enabled.
      The current solution temporarily disables the SIMD implementations
      of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode.
      The commit makes the coding results bit-wise identical between
      regular coding pipeline and high bit-depth at profile 0.
      Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf
  21. 31 Aug, 2016 1 commit
  22. 25 Aug, 2016 1 commit
    • Adjust coefficient optimization and tx_domain rd speed features. · 635ae8bd
      paulwilkins authored
      Previously Tx domain rd was used in all cases above speed 0.
      Coefficient optimization was only enabled for best and speed 0.
      This patch selectively sets these features at other speed settings
      based on block complexity.
      For the Netflix and HD sets in particular the quality gains are
      large compared to the speed hit. At speed 1 the average PSNR
      gain in the NF set is > 2.5% with one clip coming in at 18%
      and some points almost 30%. Average gains for the lower
      resolution test sets are around 1%.
      The gains are biggest at low Q so some further optimization
      may be possible.
      Change-Id: I340376c7b2a78e5389a34b7ebdc41072808d0576
  23. 12 Aug, 2016 1 commit
    • Fix another motion vector out of range bug · a413dbe5
      Yunqing Wang authored
      This patch fixed a motion vector out of range bug:
      vpxenc: ../libvpx/vp9/encoder/vp9_mcomp.c:69:
       mv_cost: Assertion `mv->col >= -((1 << (11 + 1 + 2)) - 1) &&
       mv->col < ((1 << (11 + 1 + 2)) - 1)' failed.
      For blocks that returned without having full-pixel search, the original
      MV limits were not restored, which caused the failure. Moved the set
      MV limit function down to fix the bug.
      Change-Id: Id7d798fc7214e95c6e4846c588f0233fcf1a4223
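      The fix described above can be pictured with the sketch below: the MV
      limits are only narrowed after the early-return check, and are restored
      before the function returns. MvLimitsSketch, search_block_sketch, and the
      +/-64 search window are hypothetical names and values, not the code in
      vp9_mcomp.c or vp9_rdopt.c.

```c
#include <assert.h>

/* Hypothetical types and names, for illustration only. */
typedef struct {
  int col_min, col_max, row_min, row_max;
} MvLimitsSketch;

/* Narrows the limits only when a full-pixel search actually runs, and
 * restores them before returning. Returns 1 if the search ran. */
static int search_block_sketch(MvLimitsSketch *limits, int skip_search) {
  MvLimitsSketch saved;

  assert(limits->col_min <= limits->col_max);
  assert(limits->row_min <= limits->row_max);

  if (skip_search) return 0; /* early return: limits were never modified */

  /* Setting the limits happens after the early return, not before it. */
  saved = *limits;
  if (limits->col_min < -64) limits->col_min = -64;
  if (limits->col_max > 64) limits->col_max = 64;
  if (limits->row_min < -64) limits->row_min = -64;
  if (limits->row_max > 64) limits->row_max = 64;

  /* ... the full-pixel search would run here with the narrowed limits ... */

  *limits = saved; /* restore the caller's limits */
  return 1;
}
```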
  24. 08 Aug, 2016 1 commit
  25. 05 Aug, 2016 1 commit
    • Fix a motion vector out of range bug · 2fb826c4
      Yunqing Wang authored
      This patch fixed a motion vector(MV) out of range bug, which was caused
      by not restoring the original values of the MV min/max thresholds after
      the sub8x8 full pixel motion search. It occurred rarely and was only seen
      while encoding a 4k clip for 200 frames.
      Change-Id: Ibc4e0de80846f297431923cef8a0c80fe8dcc6a5
  26. 04 Aug, 2016 1 commit
    • Fix msvc compiler warnings · 7a79fa13
      Yaowu Xu authored
      MSVC 2013 complained about using 32 shift where 64 bit shift should be
      Change-Id: I7a2b165d1a92d3c0a91dd4511b27aba7709b5e55
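      This class of warning typically arises when a shift is evaluated in int
      and only then widened to a 64-bit type. The sketch below shows the usual
      remedy of widening the operand before shifting; pow2_64 is a hypothetical
      helper, not the code changed by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 2^shift as a 64-bit value. */
static int64_t pow2_64(int shift) {
  assert(shift >= 0 && shift < 63);
  /* Writing 1 << shift would shift an int; widen first so the shift
   * itself is carried out in 64 bits. */
  return (int64_t)1 << shift;
}
```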
  27. 02 Aug, 2016 1 commit
  28. 27 Jul, 2016 1 commit
  29. 25 Jul, 2016 1 commit
  30. 21 Jul, 2016 1 commit
  31. 09 Jul, 2016 1 commit
  32. 07 Jul, 2016 4 commits
    • Enable coeff optimization for intra modes · 2f28f907
      Jingning Han authored
      This further improves the coding performance by
      lowres 0.3%
      midres 0.5%
      hdres  0.6%
      Change-Id: I6a03b6da210b9cbc261474bad4a103e0ba021c68
    • Enable uniform quantization with trellis optimization in speed 0 · 62aa642d
      Jingning Han authored
      This commit allows the inter prediction residual to use uniform
      quantization followed by trellis coefficient optimization in
      speed 0. It improves the coding performance by
      lowres 0.79%
      midres 1.07%
      hdres  1.44%
      Change-Id: I46ef8cfe042a4ccc7a0055515012cd6cbf5c9619
    • Refactor coeff_cost() function · 541eb789
      Jingning Han authored
      Move the operations that update the context buffers outside this
      function. The coeff_cost() takes all input as const value and returns
      the coefficient cost.
      This prepares for the next coefficient optimization CLs.
      Change-Id: I850eec6e5470b91ea84646ff26b9231b09f70a0c
    • Support measure distortion in the pixel domain · e357b9ef
      Jingning Han authored
      Use pixel domain distortion metric in speed 0. This improves the
      compression performance by 0.3% for both low and high resolution
      test sets.
      Change-Id: I5b5b7115960de73f0b5e5d0c69db305e490e6f1d
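      Measuring distortion in the pixel domain means comparing reconstructed
      pixels against source pixels rather than comparing transform
      coefficients. A minimal sketch of such a metric, assuming 8-bit samples
      and a plain sum of squared errors; pixel_sse_sketch is a hypothetical
      helper, not the function added by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sum of squared differences between source and
 * reconstructed 8-bit pixels over a width x height block. */
static int64_t pixel_sse_sketch(const uint8_t *src, int src_stride,
                                const uint8_t *rec, int rec_stride,
                                int width, int height) {
  int64_t sse = 0;
  int r, c;

  assert(width > 0 && height > 0);
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      const int d = src[r * src_stride + c] - rec[r * rec_stride + c];
      sse += d * d; /* at most 255 * 255 per pixel, so int is enough here */
    }
  }
  return sse;
}
```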
  33. 05 Jul, 2016 1 commit
    • Remove txfrm_block_to_raster_xy() from vp9 encoder · 14011f03
      Jingning Han authored
      The transform block row and column positions are always available
      outside the callees, so there is no need to recompute them.
      This approach is already used by the decoder. This commit
      removes txfrm_block_to_raster_xy() function.
      Change-Id: I5b90f91a0d8b7c35cfa7d171da9edf8202630108