  1. 06 Jul, 2017 1 commit
    • vp9: Nonrd mode: use content_state_sb for high motion. · 8c3f18ef
      Marco authored
      If the content_state for a superblock is set to HighSad,
      use that to bias some decisions in variance partition and
      nonrd pickmode: use int_pro_motion for sad computation in
      choose_partitioning, and set large_block in pickmode based
      on the content_state_sb.
      Only affects speed >= 7.
      Improvement for high motion content.
      Small gain (~1%) in RTC metrics.
      Speedup of ~5 for high motion clip on android (speed 8, 1 thread).
      Change-Id: I5774c4854f012b89c8e969f6129b60988c2ce11c
  2. 29 Jun, 2017 1 commit
  3. 27 Jun, 2017 1 commit
  4. 22 Jun, 2017 3 commits
  5. 18 May, 2017 2 commits
  6. 11 May, 2017 1 commit
  7. 25 Apr, 2017 2 commits
    • vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large · 69b0242e
      Jerome Jiang authored
      For speed >= 8 and color_sensitivity not set, skip the transform
      skipping test in UV planes.
      Add a new condition to check noise level to skip chroma check
      for speed >= 8 if y_sad is high.
      1~2% speedup on ARM for speed 8.
      Borg tests show neutral results in both rtc and rtc_derf.
      Change-Id: Idecd3ff6e28c97757a43bb6f3a7082c85f72109c
    • vp9: Reduce artifact in non-rd pickmode for lighting changes. · 92ec0674
      Marco authored
      Add a low-variance high-sumdiff to the superblock content state
      and use it to limit the mv and bias some decisions in non-rd pickmode.
      Only affects speed >= 6.
      Reduces artifact for lighting changes.
      Small/no difference in metrics on RTC set.
      Change-Id: Ic84b2379fe0ae3fa71ae826ee6bae3eaf551a25b
  8. 24 Apr, 2017 1 commit
    • Make the row based multi-threaded encoder deterministic · 10a497bd
      Yunqing Wang authored
      This patch followed the allow_exhaustive_searches feature modification
      and continued to modify the encoder to achieve determinism in row
      based multi-threaded encoding. With row-mt = 1 and multiple threads,
      the adaptive feature in the encoder was disabled, which gave a
      BDRate gain (at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%),
      but some encoder speed losses (7% ~ 10% at speed 1 and 3% ~ 6% at
      speed 2). These speed losses were acceptable considering the speed
      gains obtained from row-mt.
      Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
  9. 21 Apr, 2017 1 commit
    • Make allow_exhaustive_searches feature no longer adaptive · bca45646
      Yunqing Wang authored
      A previous patch turned on allow_exhaustive_searches feature only for
      FC_GRAPHICS_ANIMATION content. This patch further modified the feature
      by removing the exhaustive search limit, and made it no longer adaptive.
      As a result, the 2 counts that recorded the number of motion searches
      were removed, which helped achieve determinism in row based
      multi-threaded encoding. Tests showed that this patch didn't make
      the encoder much slower.
      Used exhaustive_searches_thresh for this speed feature, and removed
      allow_exhaustive_searches. Also, refactored the speed feature code
      to follow the general speed feature setting style.
      Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
  10. 20 Apr, 2017 1 commit
  11. 11 Apr, 2017 1 commit
    • vp9: speed >= 8: Adjust speed settings on ARM. · f16f08e5
      Jerome Jiang authored
      Set adaptive_rd_thresh to 2 when simple block yrd is not used.
      Fix regression caused by computing y sad without
      int_pro_motion_estimation on low res motion clips.
      Overall 0.07% quality loss on rtc_derf.
      Change only affects low res on speed 8.
      Change-Id: Ic6a188a56529f1034d6431005fb4b0e24e8a7e27
  12. 10 Apr, 2017 1 commit
    • vp9: 1 pass CBR: avoid nonrd_pick_partition on segment. · 6557baf3
      Marco authored
      For speed 5, 1 pass CBR: Don't use the nonrd_pick_partition
      on the segment, rather use choose_partitioning followed by
      nonrd_select_partition (as is done on base segment).
      Little/no quality loss on RTC and RTC_derf (< 0.3%),
      speedup of at least 5%.
      Change-Id: I5273d5f950e60adf5e437b4ca8c4f63964641e83
  13. 06 Apr, 2017 2 commits
  14. 05 Apr, 2017 1 commit
    • vp9: Temporal denoising: avoid denoising for speed <= 5. · 2136de93
      Marco authored
      Temporal denoiser runs in non-rd pickmode, so it is only used
      for speed >= 5. Regression exists for speed 5, due to use of
      reference_partition (which uses non-rd pickmode for partitioning).
      Avoid denoising for now at speed 5.
      Change-Id: I74a74d2e1404d7cfd33dcf4ec06dd2e503256cf0
  15. 31 Mar, 2017 1 commit
    • Enhance the row mt sync read to accept the sync_range greater than 1 · f1600db3
      Yunqing Wang authored
      The row mt sync read uses sync_range = 1, and wouldn't work if we want
      to use a sync_range that is greater than 1. To make it work, this sync
      read code is modified. Pass in col instead of col - 1 to make it
      consistent with other row mt code in VP9, and then add 1 in "while"
      Change-Id: I4a0e487190ac5d47b8216368da12d80fec779c1a
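The wait condition described in this commit can be sketched as a pure predicate. This is a minimal illustration, not the libvpx function: the name `row_mt_must_wait` and the `above_cur_col` parameter are hypothetical, and `sync_range` is assumed to be a power of two. In a threaded encoder the same test would presumably sit in the loop that blocks on a condition variable.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when the thread encoding superblock
 * row r must wait before starting column c.
 * above_cur_col is the last column already finished in row r - 1
 * (-1 if none). sync_range is assumed to be a power of two. */
static bool row_mt_must_wait(int r, int c, int above_cur_col, int sync_range) {
  /* The top row has no row above it, so it never waits. */
  if (r == 0) return false;
  /* Synchronize only at columns that are multiples of sync_range. */
  if (c & (sync_range - 1)) return false;
  /* Proceed only once the row above has finished column
   * c + sync_range - 1. The "+ 1" mirrors the "add 1" mentioned in the
   * commit message, needed because col (not col - 1) is passed in. */
  return c > above_cur_col - sync_range + 1;
}
```

With `sync_range = 1` this reduces to waiting until the row above has finished the same column; larger values check less often but require the row above to be further ahead.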
  16. 27 Mar, 2017 2 commits
    • vp9: Speed >= 8: avoid chroma check under some condition. · 0169a985
      Marco authored
      For non-rd variance partition, avoid the chroma check
      unless y_sad is below some threshold.
      Small decrease in avgPSNR (~0.3) on RTC set.
      Small/negligible decrease on RTC_derf.
      Change-Id: I7af44235af514058ccf9a4f10bb737da9d720866
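The gating described in this commit might look like the sketch below. The function name and the `y_sad_thresh` parameter are hypothetical; the commit message does not state the actual threshold value or how it is derived.

```c
#include <stdint.h>

/* Hypothetical gate: returns 1 if the chroma check should run.
 * Below speed 8 the check always runs; at speed >= 8 it runs only
 * when the luma SAD is below the (assumed) threshold. */
static int should_run_chroma_check(int speed, uint32_t y_sad,
                                   uint32_t y_sad_thresh) {
  if (speed < 8) return 1;
  return y_sad < y_sad_thresh;
}
```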
    • vp9: 1 pass: Move source sad computation into encodeframe loop. · 66c6b4d6
      Marco authored
      Refactor to split the 1 pass source sad computation into scene
      detection (currently used for VBR and screen-content mode), and
      superblock based source sad computation (used in non-rd CBR mode).
      This allows the source sad computation for CBR mode to be
      No change in compression.
      Change-Id: I112f2918613ccbd37c1771d852606d3af18c1388
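For readers unfamiliar with the term, a source SAD is the sum of absolute differences between co-located pixels of the current and previous source frames. The sketch below is a plain scalar version for one block; the function name and signature are hypothetical, and the encoder's own implementation is likely SIMD-optimized and may differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar SAD over a width x height block.
 * cur and prev point at the top-left pixel of the co-located blocks;
 * the strides are the distances in bytes between consecutive rows. */
static uint64_t block_sad(const uint8_t *cur, int cur_stride,
                          const uint8_t *prev, int prev_stride,
                          int width, int height) {
  uint64_t sad = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      sad += (uint64_t)abs((int)cur[x] - (int)prev[x]);
    }
    cur += cur_stride;
    prev += prev_stride;
  }
  return sad;
}
```

For a 64x64 superblock, the caller would pass `width = height = 64` with the luma plane strides.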
  17. 24 Mar, 2017 1 commit
  18. 23 Mar, 2017 1 commit
    • vp9: Non-rd partition: avoid unneeded call to chroma_check · 4863e07c
      Marco authored
      Since y_sad is not computed yet (on the early exit due to source_sad),
      no need to check for setting color_sensitivity.
      Only affects speed >= 8. No change in behavior.
      Change-Id: I3a6f2d20fed38d8b8ec51b75bcacf9a21f2db916
  19. 22 Mar, 2017 1 commit
  20. 21 Mar, 2017 1 commit
  21. 20 Mar, 2017 3 commits
    • vp9: Nonrd variance partition: improve split to 16x16. · 3135b854
      Marco authored
      Add additional condition to split to 16x16, for resolutions <= 360p,
      reduces dragging artifact near moving boundary.
      Small/no change on RTC metrics.
      Change-Id: I314694f2166435d918f74e7ab42f002b07f40dae
    • vp9: Use sb content measure to bias against golden. · 06c8713e
      Marco authored
      For each superblock, keep track of how far from current frame
      was the last significant content change, and use that (along
      with GF distance) to turn off GF search in non-rd pickmode.
      Only enabled for speed >= 8.
      avgPSNR on RTC/RTC_derf down by ~0.9/1.2.
      Speedup on mac: ~3-5%.
      Speedup on arm: 3.6% for VGA and 4.4% for HD.
      Change-Id: Ic3f3d6a2af650aca6ba0064d2b1db8d48c035ac7
    • Record the sum of tx block eobs in the partition block · 9c2552a1
      Yunqing Wang authored
      The sum of tx block eobs is needed in the machine learning based partition
      early termination. The eobs are first accumulated during tx search, and
      then the value associated with the best tx_size is copied to ctx for later
      After the sum of eobs is calculated correctly, re-enabled the
      ml_partition_search_early_termination speed feature.
      Re-did the quality/speed test to check the impact of the fix.
      1. Borg test BDRATE result:
      4k set:     PSNR: +0.183%; SSIM: +0.100%;
      hdres set:  PSNR: +0.168%; SSIM: +0.256%;
      midres set: PSNR: +0.186%; SSIM: +0.326%;
      2. Average speed gain result:
      4k clips: 21%;
      hd clips: 26%;
      midres clips: 15%.
      The result is in line with the original result.
      Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
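The bookkeeping described in this commit (accumulate eobs per transform size during the search, then keep the sum belonging to the winning size) can be sketched as follows. The `TxResult` struct and both function names are hypothetical and only illustrate the idea; they are not the encoder's data structures.

```c
#include <stdint.h>

/* Hypothetical record of one transform-size candidate. */
typedef struct {
  int64_t rd_cost; /* rate-distortion cost of this tx_size */
  int sum_eobs;    /* sum of end-of-block positions over its tx blocks */
} TxResult;

/* Sum the eob values of all transform blocks for one tx_size. */
static int sum_eobs_for_tx(const uint16_t *eobs, int num_tx_blocks) {
  int sum = 0;
  for (int i = 0; i < num_tx_blocks; ++i) sum += eobs[i];
  return sum;
}

/* Return the sum_eobs of the candidate with the lowest rd_cost, which
 * is the value that would be copied into the partition context.
 * num_results must be at least 1. */
static int best_tx_sum_eobs(const TxResult *results, int num_results) {
  int best = 0;
  for (int i = 1; i < num_results; ++i) {
    if (results[i].rd_cost < results[best].rd_cost) best = i;
  }
  return results[best].sum_eobs;
}
```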
  22. 13 Mar, 2017 1 commit
    • Apply machine learning-based early termination in VP9 partition search · 67010143
      Yunqing Wang authored
      This patch was based on Yang Xian's intern project code. Further modifications
      were done.
      1. Moved machine-learning related parameters into the context structure.
      2. Corrected the calculation of sum_eobs.
      3. Removed unused parameters and calculations.
      4. Made it work with multiple tiles.
      5. Added a speed feature for the machine-learning based partition search
      early termination.
      6. Re-organized the code.
      The patch was rebased to the top-of-tree.
      Borg test BDRATE result:
      4k set:     PSNR: +0.144%; SSIM: +0.043%;
      hdres set:  PSNR: +0.149%; SSIM: +0.269%;
      midres set: PSNR: +0.127%; SSIM: +0.257%;
      Average speed gain result:
      4k clips: 22%;
      hd clips: 23%;
      midres clips: 15%.
      Change-Id: I0220e93a8277e6a7ea4b2c34b605966e3b1584ac
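The commit message does not describe the model itself. As a rough illustration of how a learned early-termination test can work, the sketch below uses a linear classifier over a small feature vector. Everything here is an assumption: the function name, the choice of a linear model, the sign convention, and the features (sum_eobs is mentioned in the commit; any others, such as rate or distortion, are guesses).

```c
/* Hypothetical linear classifier: returns 1 if the partition search
 * should stop at the current block size.
 * features and weights both hold n values; bias is the model offset.
 * A non-negative score is (by assumption) read as "terminate". */
static int ml_should_terminate(const float *features, const float *weights,
                               float bias, int n) {
  float score = bias;
  for (int i = 0; i < n; ++i) score += weights[i] * features[i];
  return score >= 0.0f;
}
```

A test this cheap runs once per partition block, so its cost is small compared with the recursive partition search it can cut short.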
  23. 08 Mar, 2017 1 commit
    • Make the partition search early termination feature to be frame size dependent · 099e9bf1
      Yunqing Wang authored
      The 2 thresholds (i.e. partition_search_breakout_dist_thr and
      partition_search_breakout_rate_thr) are used as the partition search
      early termination speed feature. This refactoring patch made this
      feature frame size dependent consistently throughout the code.
      Change-Id: Idaa0bd8400badaa0f8e2091e3f41ed2544e71be9
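A frame-size-dependent pair of breakout thresholds could be set up along the lines below. This is only a sketch: the struct, the function names, the 720-pixel cutoff, and all numeric values are placeholders chosen for illustration, not the values used by the encoder.

```c
#include <stdint.h>

/* Hypothetical container for the two breakout thresholds. */
typedef struct {
  int64_t dist_thr; /* distortion threshold */
  int rate_thr;     /* rate threshold */
} BreakoutThresholds;

/* Pick thresholds from the smaller frame dimension.
 * The cutoff and the values are placeholders. */
static BreakoutThresholds breakout_thresholds_for_frame(int width,
                                                        int height) {
  const int min_dim = width < height ? width : height;
  BreakoutThresholds t;
  if (min_dim >= 720) {
    t.dist_thr = (int64_t)1 << 23;
    t.rate_thr = 100;
  } else {
    t.dist_thr = (int64_t)1 << 21;
    t.rate_thr = 80;
  }
  return t;
}

/* Stop searching smaller partitions when both the distortion and the
 * rate of the current best choice are under their thresholds. */
static int should_breakout(int64_t dist, int rate, BreakoutThresholds t) {
  return dist < t.dist_thr && rate < t.rate_thr;
}
```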
  24. 02 Mar, 2017 1 commit
  25. 27 Feb, 2017 2 commits
    • vp9: Fix an issue with setting variance thresholds. · defe094e
      Marco authored
      From commit:
      On non-segment the set_vbp_thresholds() should be called
      again to adjust thresholds based on content_state of superblock.
      This was the intended behavior from 441393.
      Small change in RTC metrics and speed.
      Change-Id: I45e5fbdc4af74db76b3cb4f13074fcae0eb2219e
    • vp9: Rename new_mt to row_mt · 58816014
      Vignesh Venkatasubramanian authored
      new_mt is a very generic name that will become obsolete soon enough.
      Since this is exposed as a codec control, renaming it to row_mt to
      signify row level parallelism. Also renaming the ETHREAD_BIT_MATCH
      codec control to ROW_MT_BIT_EXACT.
      Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
  26. 22 Feb, 2017 2 commits
  27. 15 Feb, 2017 1 commit
    • Row based multi-threading of encoding stage · 71061e93
      Ranjit Kumar Tulabandu authored
      (Yunqing Wang)
      This patch implements the row-based multi-threading within tiles in
      the encoding pass, and substantially speeds up the multi-threaded
      encoder in VP9.
      Speed tests at speed 1 on the STDHD set (using 4 tiles) show that the
      average speedup of the encoding pass (second pass in the 2-pass
      encoding) is 7% while using 2 threads, 16% while using 4 threads,
      85% while using 8 threads, and 116% while using 16 threads.
      Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
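One plausible reading of why the gains grow sharply at 8 and 16 threads: with 4 tiles, tile-level threading offers at most 4 units of work, while row-level threading splits each tile into superblock rows. The sketch below just counts those units. The function names are hypothetical, the 1080-line frame height in the usage note is an illustrative assumption (the commit does not give the STDHD resolution), and rows within a tile still depend on the row above, so not all of them can run at once.

```c
/* VP9 superblocks are 64x64 pixels. */
#define SB_SIZE 64

/* Number of superblock rows in a frame of the given height. */
static int sb_rows(int frame_height) {
  return (frame_height + SB_SIZE - 1) / SB_SIZE;
}

/* Units of work with tile-based threading: one per tile column. */
static int tile_jobs(int num_tile_cols) { return num_tile_cols; }

/* Units of work with row-based threading: one per superblock row
 * in each tile column. */
static int row_jobs(int frame_height, int num_tile_cols) {
  return sb_rows(frame_height) * num_tile_cols;
}
```

For a 1080-line frame with 4 tile columns this gives 4 tile jobs versus 68 row jobs, which leaves work available for threads beyond the fourth.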
  28. 07 Feb, 2017 2 commits
  29. 02 Feb, 2017 1 commit