1. 29 Jul, 2017 3 commits
    • Revert "vp9: Speed feature to adapt partition based on source_sad." · c9266b85
      James Zern authored
      This reverts commit 064fc570.
      
      This causes an assertion failure in vp9_mcomp.c when running
      gtest_filter=VP9/MotionVectorTestLarge.OverallTest/41:
      `mv->col >= -((1 << (11 + 1 + 2)) - 1) && mv->col < ((1 << (11 + 1 + 2))
      - 1)'
      
      Change-Id: I449e777bf18b661cb3f1d82253610c55c51687f6
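The bound quoted in the assertion evaluates to (1 << 14) - 1 = 16383. A minimal sketch of the same range check follows; the names are hypothetical and not taken from vp9_mcomp.c, and only the arithmetic comes from the assertion text:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the bit count mirrors the literal 11 + 1 + 2
 * that appears in the assertion message. */
enum { MV_BITS_SKETCH = 11 + 1 + 2 };
enum { MV_LIMIT_SKETCH = (1 << MV_BITS_SKETCH) - 1 }; /* 16383 */

/* Same comparison as the failing assertion: the lower bound is
 * inclusive, the upper bound is exclusive. */
bool mv_component_in_range(int v) {
  return v >= -MV_LIMIT_SKETCH && v < MV_LIMIT_SKETCH;
}

/* Aborts (in builds with assertions enabled) when the component is
 * out of range, which is how the test failure surfaced. */
void check_mv_component(int v) {
  assert(mv_component_in_range(v));
}
```

Under this sketch a column value of 16383 or -16384 would trip the check, while -16383 through 16382 would pass.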
    • vp9: Adjust logic in source sad for screen content. · ac211fe2
      Jerome Jiang authored
      Change-Id: I917d106f4c95ea44e413e23881f6303982e1a6a3
    • vp9: Speed feature to adapt partition based on source_sad. · 064fc570
      Marco authored
      Move the source_sad feature to speed 6 (from speed 7), and
      add speed feature to switch from the variance-based partition
      to reference_partition (which uses nonrd-pickmode for bsize selection)
      if source_sad is high.
      
      Currently used only for speed 6 for resolution <= 360p.
      About 4-5% improvement on 360p in RTC set.
      Some speed slowdown, but still ~30% faster than speed 5.
      
      Change-Id: Ib0330ee5fe9fdd2608aed91359a2a339d967491c
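The selection rule described in this message can be sketched roughly as below. This is an illustration only: the type and function names are hypothetical, and treating "<= 360p" as at most 640x360 luma samples is an assumption, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two partition modes named above. */
typedef enum {
  VAR_BASED_PARTITION_SKETCH,
  REFERENCE_PARTITION_SKETCH
} PartitionModeSketch;

/* Assumption: "<= 360p" is modeled as at most 640x360 luma samples. */
bool is_360p_or_smaller(int width, int height) {
  return width * height <= 640 * 360;
}

/* Switch away from variance-based partitioning only when all the
 * conditions listed in the commit message hold. */
PartitionModeSketch pick_partition_mode(int speed, int width, int height,
                                        bool source_sad_high) {
  if (speed == 6 && is_360p_or_smaller(width, height) && source_sad_high)
    return REFERENCE_PARTITION_SKETCH;
  return VAR_BASED_PARTITION_SKETCH;
}
```

For example, pick_partition_mode(6, 640, 360, true) selects the reference partition in this sketch, while a 1280x720 frame at the same speed stays on the variance-based path.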
  2. 22 Jul, 2017 1 commit
  3. 17 Jul, 2017 2 commits
    • vp9: Fix to setting content_state for real-time mode. · ad563713
      Marco authored
      When content_state_sb is set to LowVarHighSumdiff, don't reset
      it to VeryHighSad. Visually better on clips with strong lighting changes.
      
      Small/negligible change in RTC metrics and speed.
      
      Change-Id: I20c383e3c4cf8d1149de5f9260449c0b7cf7c6aa
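A minimal sketch of the guard this message describes. The enum and function are hypothetical and cover only the two states named above plus a catch-all; the real content-state enum may have more members.

```c
#include <assert.h>

/* Hypothetical subset of superblock content states. */
typedef enum {
  STATE_OTHER_SKETCH,
  STATE_LOW_VAR_HIGH_SUMDIFF_SKETCH,
  STATE_VERY_HIGH_SAD_SKETCH
} ContentStateSketch;

/* Promote to VeryHighSad when the SAD is very high, unless the state
 * was already marked LowVarHighSumdiff, which is left untouched. */
ContentStateSketch apply_very_high_sad(ContentStateSketch current,
                                       int sad_is_very_high) {
  if (sad_is_very_high && current != STATE_LOW_VAR_HIGH_SUMDIFF_SKETCH)
    return STATE_VERY_HIGH_SAD_SKETCH;
  return current;
}
```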
    • vp9: Reuse motion from choose_partitioning in NEWMV search. · 0c9e2f4c
      Marco authored
      When int_pro_motion_estimation is done for superblock in
      choose_partitioning, use it to avoid the full_pixel_search
      for NEWMV mode, if bsize is >= 32X32.
      
      For speed > 7.
      Small/neutral change on RTC metrics.
      ~1-2% speedup on arm on high motion clip.
      
      Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b
  4. 14 Jul, 2017 1 commit
  5. 11 Jul, 2017 1 commit
  6. 07 Jul, 2017 1 commit
  7. 06 Jul, 2017 1 commit
    • vp9: Nonrd mode: use content_state_sb for high motion. · 8c3f18ef
      Marco authored
      If the content_state for a superblock is set to HighSad,
      use that to bias some decisions in variance partition and
      nonrd pickmode: use int_pro_motion for sad computation in
      choose_partitioning, and set large_block in pickmode based
      on the content_state_sb.
      
      Only affects speed >= 7.
      
      Improvement for high motion content.
      Small gain (~1%) in RTC metrics.
      Speedup of ~5 for high motion clip on android (speed 8, 1 thread).
      
      Change-Id: I5774c4854f012b89c8e969f6129b60988c2ce11c
  8. 29 Jun, 2017 1 commit
  9. 27 Jun, 2017 1 commit
  10. 22 Jun, 2017 3 commits
  11. 18 May, 2017 2 commits
  12. 11 May, 2017 1 commit
  13. 25 Apr, 2017 2 commits
    • vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large · 69b0242e
      Jerome Jiang authored
      For speed >= 8 and color_sensitivity not set, skip the transform
      skipping test in UV planes.
      Add a new condition to check noise level to skip chroma check
      for speed >= 8 if y_sad is high.
      
      1~2% speedup on ARM for speed 8.
      
      Borg tests show neutral results in both rtc and rtc_derf.
      
      Change-Id: Idecd3ff6e28c97757a43bb6f3a7082c85f72109c
    • vp9: Reduce artifact in non-rd pickmode for lighting changes. · 92ec0674
      Marco authored
      Add a low-variance high-sumdiff to the superblock content state
      and use it to limit the mv and bias some decisions in non-rd pickmode.
      Only affects speed >= 6.
      
      Reduces artifact for lighting changes.
      Small/no difference in metrics on RTC set.
      
      Change-Id: Ic84b2379fe0ae3fa71ae826ee6bae3eaf551a25b
  14. 24 Apr, 2017 1 commit
    • Make the row based multi-threaded encoder deterministic · 10a497bd
      Yunqing Wang authored
      This patch followed the allow_exhaustive_searches feature modification
      and continued to modify the encoder to achieve determinism in row-based
      multi-threaded encoding. With row-mt = 1 and multiple threads, the
      adaptive feature in the encoder was disabled, which gave a BDRate gain
      (at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%) but some
      encoder speed loss (7% ~ 10% at speed 1 and 3% ~ 6% at speed 2). These
      speed losses were acceptable considering the speed gains obtained from
      row-mt.
      
      Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
  15. 21 Apr, 2017 1 commit
    • Make allow_exhaustive_searches feature no longer adaptive · bca45646
      Yunqing Wang authored
      A previous patch turned on the allow_exhaustive_searches feature only
      for FC_GRAPHICS_ANIMATION content. This patch further modified the
      feature by removing the exhaustive search limit, and made it no longer
      adaptive. As a result, the 2 counts that recorded the number of motion
      searches were removed, which helped achieve determinism in row-based
      multi-threaded encoding. Tests showed that this patch didn't make
      the encoder much slower.
      
      Used exhaustive_searches_thresh for this speed feature, and removed
      allow_exhaustive_searches. Also, refactored the speed feature code
      to follow the general speed feature setting style.
      
      Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
  16. 20 Apr, 2017 1 commit
  17. 11 Apr, 2017 1 commit
    • vp9: speed >= 8: Adjust speed settings on ARM. · f16f08e5
      Jerome Jiang authored
      Set adaptive_rd_thresh to 2 when simple block yrd is not used.
      
      Fix regression caused by computing y sad without
      int_pro_motion_estimation on low res motion clips.
      
      Overall 0.07% quality loss on rtc_derf.
      
      Change only affects low res on speed 8.
      
      Change-Id: Ic6a188a56529f1034d6431005fb4b0e24e8a7e27
  18. 10 Apr, 2017 1 commit
    • vp9: 1 pass CBR: avoid nonrd_pick_partition on segment. · 6557baf3
      Marco authored
      For speed 5, 1 pass CBR: Don't use the nonrd_pick_partition
      on the segment, rather use choose_partitioning followed by
      nonrd_select_partition (as is done on base segment).
      
      Little/no quality loss on RTC and RTC_derf (< 0.3%),
      speedup of at least 5%.
      
      Change-Id: I5273d5f950e60adf5e437b4ca8c4f63964641e83
  19. 06 Apr, 2017 2 commits
  20. 05 Apr, 2017 1 commit
    • vp9: Temporal denoising: avoid denoising for speed <= 5. · 2136de93
      Marco authored
      Temporal denoiser runs in non-rd pickmode, so it is only used
      for speed >= 5. Regression exists for speed 5, due to use of
      reference_partition (which uses non-rd pickmode for partitioning).
      Avoid denoising for now at speed 5.
      
      Change-Id: I74a74d2e1404d7cfd33dcf4ec06dd2e503256cf0
  21. 31 Mar, 2017 1 commit
    • Enhance the row mt sync read to accept the sync_range greater than 1 · f1600db3
      Yunqing Wang authored
      The row mt sync read uses sync_range = 1, and wouldn't work if we want
      to use a sync_range that is greater than 1. To make it work, this sync
      read code is modified. Pass in col instead of col - 1 to make it
      consistent with other row mt code in VP9, and then add 1 in the "while"
      condition.
      
      Change-Id: I4a0e487190ac5d47b8216368da12d80fec779c1a
  22. 27 Mar, 2017 2 commits
    • vp9: Speed >= 8: avoid chroma check under some condition. · 0169a985
      Marco authored
      For non-rd variance partition, avoid the chroma check
      unless y_sad is below some threshold.
      
      Small decrease in avgPSNR (~0.3) on RTC set.
      Small/negligible decrease on RTC_derf.
      
      Change-Id: I7af44235af514058ccf9a4f10bb737da9d720866
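The gating described here can be sketched as a small predicate. The function name and the threshold parameter are hypothetical; the message does not give the threshold value, and running the check unconditionally below speed 8 is an assumption about the unchanged path.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the chroma check should run.
 * Assumption: below speed 8 the check is not gated at all. */
bool should_run_chroma_check(int speed, unsigned int y_sad,
                             unsigned int y_sad_thresh) {
  if (speed < 8) return true;
  /* At speed >= 8, run the check only when y_sad is below the threshold. */
  return y_sad < y_sad_thresh;
}
```

With a hypothetical threshold of 1000, a superblock with y_sad = 1500 would skip the chroma check at speed 8 in this sketch, trading a small PSNR loss for less work per superblock.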
    • vp9: 1 pass: Move source sad computation into encodeframe loop. · 66c6b4d6
      Marco authored
      Refactor to split the 1 pass source sad computation into scene
      detection (currently used for VBR and screen-content mode), and
      superblock based source sad computation (used in non-rd CBR mode).
      
      This allows the source sad computation for CBR mode to be
      multi-threaded.
      
      No change in compression.
      
      Change-Id: I112f2918613ccbd37c1771d852606d3af18c1388
  23. 24 Mar, 2017 1 commit
  24. 23 Mar, 2017 1 commit
    • vp9: Non-rd partition: avoid unneeded call to chroma_check · 4863e07c
      Marco authored
      Since y_sad is not computed yet (on the early exit due to source_sad),
      there is no need to check for setting color_sensitivity.
      
      Only affects speed >=8. No change in behavior.
      
      Change-Id: I3a6f2d20fed38d8b8ec51b75bcacf9a21f2db916
  25. 22 Mar, 2017 1 commit
  26. 21 Mar, 2017 1 commit
  27. 20 Mar, 2017 3 commits
    • vp9: Nonrd variance partition: improve split to 16x16. · 3135b854
      Marco authored
      Add additional condition to split to 16x16, for resolutions <= 360p,
      reduces dragging artifact near moving boundary.
      
      Small/no change on RTC metrics.
      
      Change-Id: I314694f2166435d918f74e7ab42f002b07f40dae
    • vp9: Use sb content measure to bias against golden. · 06c8713e
      Marco authored
      For each superblock, keep track of how far from the current frame
      the last significant content change was, and use that (along
      with GF distance) to turn off GF search in non-rd pickmode.
      
      Only enabled for speed >= 8.
      
      avgPSNR on RTC/RTC_derf down by ~0.9/1.2.
      Speedup on mac: ~3-5%.
      Speedup on arm: 3.6% for VGA and 4.4% for HD.
      
      Change-Id: Ic3f3d6a2af650aca6ba0064d2b1db8d48c035ac7
    • Record the sum of tx block eobs in the partition block · 9c2552a1
      Yunqing Wang authored
      The sum of tx block eobs is needed in the machine learning based partition
      early termination. The eobs are first accumulated during tx search, and
      then the value associated with the best tx_size is copied to ctx for later
      use.
      
      After the sum of eobs was calculated correctly, re-enabled the
      ml_partition_search_early_termination speed feature.
      
      Re-did the quality/speed test to check the impact of the fix.
      
      1. Borg test BDRATE result:
      4k set:     PSNR: +0.183%; SSIM: +0.100%;
      hdres set:  PSNR: +0.168%; SSIM: +0.256%;
      midres set: PSNR: +0.186%; SSIM: +0.326%;
      
      2. Average speed gain result:
      4k clips: 21%;
      hd clips: 26%;
      midres clips: 15%.
      
      The result is in line with the original result.
      
      Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
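The bookkeeping described in this message (accumulate eobs per candidate during tx search, then keep the sum that belongs to the best tx_size) might look roughly like the sketch below. All types and names are hypothetical and much simpler than the encoder's real structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one tx_size candidate tried during tx search. */
typedef struct {
  int64_t rd_cost;      /* rate-distortion cost of this candidate */
  const uint16_t *eobs; /* end-of-block position per tx block */
  size_t num_blocks;    /* number of tx blocks in the partition block */
} TxCandidateSketch;

/* Hypothetical slice of the per-partition context. */
typedef struct {
  int best_index;        /* index of the winning candidate, -1 if none */
  unsigned int sum_eobs; /* sum of eobs for the winning candidate */
} PartitionCtxSketch;

/* Accumulate the eobs of all tx blocks for one candidate. */
unsigned int sum_tx_eobs(const uint16_t *eobs, size_t n) {
  unsigned int sum = 0;
  for (size_t i = 0; i < n; ++i) sum += eobs[i];
  return sum;
}

/* Keep the eob sum that belongs to the lowest-cost candidate, so later
 * stages read a value consistent with the chosen tx_size. */
void record_best_sum_eobs(const TxCandidateSketch *cands, size_t num_cands,
                          PartitionCtxSketch *ctx) {
  int64_t best_rd = INT64_MAX;
  ctx->best_index = -1;
  ctx->sum_eobs = 0;
  for (size_t i = 0; i < num_cands; ++i) {
    if (cands[i].rd_cost < best_rd) {
      best_rd = cands[i].rd_cost;
      ctx->best_index = (int)i;
      ctx->sum_eobs = sum_tx_eobs(cands[i].eobs, cands[i].num_blocks);
    }
  }
}
```

The point of the sketch is that the stored sum always tracks the winning candidate rather than whichever candidate was evaluated last.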
  28. 13 Mar, 2017 1 commit
    • Apply machine learning-based early termination in VP9 partition search · 67010143
      Yunqing Wang authored
      This patch was based on Yang Xian's intern project code. Further modifications
      were done.
      1. Moved machine-learning related parameters into the context structure.
      2. Corrected the calculation of sum_eobs.
      3. Removed unused parameters and calculations.
      4. Made it work with multiple tiles.
      5. Added a speed feature for the machine-learning based partition search
      early termination.
      6. Re-organized the code.
      
      The patch was rebased to the top-of-tree.
      
      Borg test BDRATE result:
      4k set:     PSNR: +0.144%; SSIM: +0.043%;
      hdres set:  PSNR: +0.149%; SSIM: +0.269%;
      midres set: PSNR: +0.127%; SSIM: +0.257%;
      
      Average speed gain result:
      4k clips: 22%;
      hd clips: 23%;
      midres clips: 15%.
      
      Change-Id: I0220e93a8277e6a7ea4b2c34b605966e3b1584ac
  29. 08 Mar, 2017 1 commit
    • Make the partition search early termination feature frame size dependent · 099e9bf1
      Yunqing Wang authored
      The 2 thresholds (i.e. partition_search_breakout_dist_thr and
      partition_search_breakout_rate_thr) are used as the partition search
      early termination speed feature. This refactoring patch made this
      feature frame size dependent consistently throughout the code.
      
      Change-Id: Idaa0bd8400badaa0f8e2091e3f41ed2544e71be9