1. 31 Jul, 2017 1 commit
    • vp9: Fix denoising condition when pickmode partition is used. · 999bd6ea
      Marco authored
      When the superblock partition is based on the nonrd-pickmode,
      denoising must be skipped. The condition was previously based on
      the speed level. This change applies the condition at the
      superblock level, since the switch in partitioning may be made at
      the sb level based on source_sad (e.g., in speed 6).
      
      Change-Id: I12ece4f60b93ed34ee65ff2d6cdce1213c36de04
  2. 17 Jul, 2017 1 commit
    • vp9: Reuse motion from choose_partitioning in NEWMV search. · 0c9e2f4c
      Marco authored
      When int_pro_motion_estimation is done for the superblock in
      choose_partitioning, reuse its result to skip the full_pixel_search
      for NEWMV mode when bsize >= 32X32.
      
      For speed > 7.
      Small/neutral change on RTC metrics.
      ~1-2% speedup on arm on high motion clip.
      
      Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b
  3. 07 Jul, 2017 1 commit
  4. 29 Jun, 2017 1 commit
  5. 03 May, 2017 1 commit
  6. 25 Apr, 2017 1 commit
    • vp9: Reduce artifact in non-rd pickmode for lighting changes. · 92ec0674
      Marco authored
      Add a low-variance, high-sumdiff category to the superblock content state
      and use it to limit the mv and bias some decisions in non-rd pickmode.
      Only affects speed >= 6.
      
      Reduces artifact for lighting changes.
      Small/no difference in metrics on RTC set.
      
      Change-Id: Ic84b2379fe0ae3fa71ae826ee6bae3eaf551a25b
  7. 21 Apr, 2017 1 commit
    • Make allow_exhaustive_searches feature no longer adaptive · bca45646
      Yunqing Wang authored
      A previous patch turned on the allow_exhaustive_searches feature only for
      FC_GRAPHICS_ANIMATION content. This patch further modifies the feature
      by removing the exhaustive search limit and making it no longer adaptive.
      As a result, the 2 counts that recorded the number of motion searches
      were removed, which helps achieve determinism in row-based
      multi-threaded encoding. Tests showed that this patch does not make
      the encoder much slower.
      
      Used exhaustive_searches_thresh for this speed feature, and removed
      allow_exhaustive_searches. Also, refactored the speed feature code
      to follow the general speed feature setting style.
      
      Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
  8. 27 Mar, 2017 1 commit
    • vp9: 1 pass: Move source sad computation into encodeframe loop. · 66c6b4d6
      Marco authored
      Refactor to split the 1 pass source sad computation into scene
      detection (currently used for VBR and screen-content mode), and
      superblock based source sad computation (used in non-rd CBR mode).
      
      This allows the source sad computation for CBR mode to be
      multi-threaded.
      
      No change in compression.
      
      Change-Id: I112f2918613ccbd37c1771d852606d3af18c1388
  9. 20 Mar, 2017 2 commits
    • vp9: Use sb content measure to bias against golden. · 06c8713e
      Marco authored
      For each superblock, keep track of how far back from the current frame
      the last significant content change was, and use that (along
      with GF distance) to turn off GF search in non-rd pickmode.
      
      Only enabled for speed >= 8.
      
      avgPSNR on RTC/RTC_derf down by ~0.9/1.2.
      Speedup on mac: ~3-5%.
      Speedup on arm: 3.6% for VGA and 4.4% for HD.
      
      Change-Id: Ic3f3d6a2af650aca6ba0064d2b1db8d48c035ac7
    • Record the sum of tx block eobs in the partition block · 9c2552a1
      Yunqing Wang authored
      The sum of tx block eobs is needed in the machine learning based partition
      early termination. The eobs are first accumulated during tx search, and
      then the value associated with the best tx_size is copied to ctx for later
      use.
      
      Now that the sum of eobs is calculated correctly, re-enabled the
      ml_partition_search_early_termination speed feature.
      
      Re-did the quality/speed test to check the impact of the fix.
      
      1. Borg test BDRATE result:
      4k set:     PSNR: +0.183%; SSIM: +0.100%;
      hdres set:  PSNR: +0.168%; SSIM: +0.256%;
      midres set: PSNR: +0.186%; SSIM: +0.326%;
      
      2. Average speed gain result:
      4k clips: 21%;
      hd clips: 26%;
      midres clips: 15%.
      
      The result is in line with the original result.
      
      Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
  10. 11 Mar, 2017 1 commit
  11. 15 Feb, 2017 1 commit
    • Row based multi-threading of encoding stage · 71061e93
      Ranjit Kumar Tulabandu authored
      (Yunqing Wang)
      This patch implements the row-based multi-threading within tiles in
      the encoding pass, and substantially speeds up the multi-threaded
      encoder in VP9.
      
      Speed tests at speed 1 on the STDHD set (using 4 tiles) show that the
      average speedup of the encoding pass (second pass in the 2-pass
      encoding) is 7% while using 2 threads, 16% while using 4 threads,
      85% while using 8 threads, and 116% while using 16 threads.
      
      Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
  12. 24 Jan, 2017 1 commit
  13. 20 Jan, 2017 1 commit
  14. 25 Aug, 2016 1 commit
    • Adjust coefficient optimization and tx_domain rd speed features. · 635ae8bd
      paulwilkins authored
      Previously Tx domain rd was used in all cases above speed 0.
      Coefficient optimization was only enabled for best and speed 0.
      
      This patch selectively sets these features at other speed settings
      based on block complexity.
      
      For the Netflix and HD sets in particular, the quality gains are
      large compared to the speed hit. At speed 1 the average PSNR
      gain in the NF set is > 2.5%, with one clip coming in at 18%
      and some points almost 30%. Average gains for the lower
      resolution test sets are around 1%.
      
      The gains are biggest at low Q so some further optimization
      may be possible.
      
      Change-Id: I340376c7b2a78e5389a34b7ebdc41072808d0576
  15. 08 Aug, 2016 1 commit
  16. 02 Aug, 2016 1 commit
  17. 13 Jun, 2016 1 commit
    • vp9: Encoding cycle reduction for speed 8. · f9c05872
      JackyChen authored
      1. Skip golden non-zeromv and newmv-last for bsize >= 16x16 if the
      temporal variance obtained from choose_partitioning is very low.
      2. Skip horz and vert INTRA mode for speed 8.
      
      This change works best on clips with little noise and some
      motion (e.g. gips_motion, which has > 5% speed-up). PSNR drop is 1.78%
      on the rtc test set; no obvious visual quality regression was found.
      
      Change-Id: Ib43b5b20e67809d03c5a6890818ddff59e1fc94a
  18. 01 Jun, 2016 1 commit
    • vp9: Skip some modes when variance is low for big blocks, for 1 pass real-time. · bacc67f4
      jackychen authored
      Skip intra-mode and some inter-modes (newmv, nearmv, nearestmv) for
      golden frame if the variance obtained from choose_partitioning is very low.
      Applies only to 1 pass real-time CBR mode and bsize >= 32x32. It gives ~2.5%
      speed-up with less than 0.1% PSNR drop on the rtc test set. No
      visual regression was seen.
      
      Change-Id: I70efbc95a1007231ae36f02c5b2fbf6cd35077ad
  19. 09 Feb, 2016 1 commit
    • Restore previous motion search bit-error scale. · fac947df
      Alex Converse authored
      The bit to error transformation got doubled as a result of going from
      8-bit to 9-bit costs (change d13385ce).
      
      Use defines to derive the scale numbers and comment some of the fields.
      
      derf: -0.023 BDRATE
      hevcmr: +0.067 BDRATE
      stdhd: +0.098 BDRATE
      (These are substantially smaller than the original gains from 8 to
      9 bit costing.)
      
      Change-Id: I6a2b3b029b2f1415e4f90a05709b2333ec0eea9b
  20. 28 Jan, 2016 1 commit
    • vp9 non-rd mode: Modification for detected skin areas. · b39a599c
      Marco authored
      If a superblock contains a lot of "skin", then force a split
      of the 64x64 partition, and make some adjustments in mode selection.
      
      This helps to reduce artifacts on moving face/skin areas at low bitrates.
      
      Little/no change in metrics: avgPSNR/SSIM down by ~0.12%.
      Small encoding time increase < 1%.
      
      Change-Id: Ic57f52148c3716f391419fab0530d916e4c1d186
  21. 13 Nov, 2015 1 commit
    • Changes to exhaustive motion search. · 0149fb3d
      paulwilkins authored
      This change alters the nature and use of exhaustive motion search.
      
      Firstly any exhaustive search is preceded by a normal step search.
      The exhaustive search is only carried out if the distortion resulting
      from the step search is above a threshold value.
      
      Secondly the simple +/- 64 exhaustive search is replaced by a
      multi stage mesh based search where each stage has a range
      and step/interval size. Subsequent stages use the best position from
      the previous stage as the center of the search but use a reduced range
      and interval size.
      
      For example:
        stage 1: Range +/- 64 interval 4
        stage 2: Range +/- 32 interval 2
        stage 3: Range +/- 15 interval 1
      
      This process, especially when it follows on from a normal step
      search, has shown itself to be almost as effective as a full range
      exhaustive search with step 1 but greatly lowers the computational
      complexity such that it can be used in some cases for speeds 0-2.
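
A minimal sketch in C of the staged mesh search described above, assuming a caller-supplied cost function. All names here (MotionVec, MeshStage, CostFn, mesh_search, gated_mesh_search, l1_cost) are hypothetical and are not libvpx identifiers, and the toy L1 cost merely stands in for a real distortion metric. The gate mirrors the rule that the mesh search only runs when the step-search result is above a threshold.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not libvpx identifiers. */
typedef struct { int row, col; } MotionVec;
typedef struct { int range, interval; } MeshStage;
typedef unsigned int (*CostFn)(MotionVec mv, const void *ctx);

/* Run each stage on a grid centered on the best position found so far.
 * Later stages are expected to use a smaller range and interval. */
static MotionVec mesh_search(MotionVec center, const MeshStage *stages,
                             int num_stages, CostFn cost, const void *ctx) {
  MotionVec best = center;
  unsigned int best_cost = cost(center, ctx);
  for (int s = 0; s < num_stages; ++s) {
    const MotionVec origin = best; /* center of this stage */
    const int range = stages[s].range;
    const int step = stages[s].interval;
    assert(range >= 0 && step > 0);
    for (int dr = -range; dr <= range; dr += step) {
      for (int dc = -range; dc <= range; dc += step) {
        const MotionVec mv = { origin.row + dr, origin.col + dc };
        const unsigned int c = cost(mv, ctx);
        if (c < best_cost) {
          best_cost = c;
          best = mv;
        }
      }
    }
  }
  return best;
}

/* Only refine with the mesh search when the step-search result is poor,
 * i.e. its cost is above the threshold. */
static MotionVec gated_mesh_search(MotionVec step_best, unsigned int thresh,
                                   const MeshStage *stages, int num_stages,
                                   CostFn cost, const void *ctx) {
  if (cost(step_best, ctx) <= thresh) return step_best;
  return mesh_search(step_best, stages, num_stages, cost, ctx);
}

/* Toy cost for demonstration: L1 distance to a target vector. */
static unsigned int l1_cost(MotionVec mv, const void *ctx) {
  const MotionVec *target = (const MotionVec *)ctx;
  return (unsigned int)(abs(mv.row - target->row) + abs(mv.col - target->col));
}
```

With the three example stages (+/- 64 interval 4, +/- 32 interval 2, +/- 15 interval 1), the sketch evaluates roughly 33*33 + 33*33 + 31*31 positions, compared with 129*129 for a full +/- 64 search at interval 1.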
      
      This patch also removes a double exhaustive search for sub 8x8 blocks,
      which also contained a bug (the two searches used different distortion
      metrics).
      
      For best quality in my test animation sequence this patch has almost
      no impact on quality but improves encode speed by more than 5X.
      
      Restricted use in good quality speeds 0-2 yields significant quality gains
      on the animation test of 0.2 - 0.5 dB with only a small impact on encode
      speed. On most clips, though, the quality gain and speed impact are small.
      
      Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
  22. 31 Jul, 2015 1 commit
  23. 29 Jul, 2015 1 commit
  24. 29 Jun, 2015 1 commit
  25. 04 Feb, 2015 1 commit
    • Account for chroma component costs in RTC mode decision · 0c6d3a03
      Jingning Han authored
      This commit allows the encoder to account for additional chroma
      plane costs in the mode decision process, if the current block
      potentially contains significant color change. It improves the
      visual quality at very low bit-rates.
      
      The compression performance of dark720p is improved by 12.39% in
      speed 6. For jimred at 150 kbps, the PSNR of V component (red)
      increased by 0.2 dB, at the expense of about 5% increase in
      encoding time. Note that for sequences where the chroma components
      are fairly consistent, the encoding time increase is negligible.
      
      On average the rtc set compression performance is improved by
      1.172% in PSNR and 1.920% in SSIM.
      
      Change-Id: Ia55b24ef23a25304f7ec9958fbf07fd6e658505c
  26. 22 Dec, 2014 1 commit
  27. 19 Dec, 2014 1 commit
  28. 18 Dec, 2014 1 commit
  29. 20 Nov, 2014 1 commit
    • vp9_ethread: move max/min partition size to mb struct · ad7586a9
      Yunqing Wang authored
      The max_partition_size and min_partition_size are set at the
      beginning while setting speed features, and then adjusted at
      SB level. Moving them to mb struct ensures there is a local
      copy for each thread.
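
A minimal sketch in C of why a per-thread copy matters, assuming two hypothetical structs (SharedSpeedFeatures, ThreadMb) that are stand-ins and not the actual libvpx definitions. The SB-level adjustment rule shown (narrowing the range) is also an assumption chosen for illustration. The point is only that each thread adjusts its own copy, so one thread's SB-level change cannot affect another thread or the shared initial values.

```c
#include <assert.h>

/* Hypothetical stand-ins; not the actual libvpx structs. */
typedef struct {
  int min_partition_size; /* initial values set with the speed features */
  int max_partition_size;
} SharedSpeedFeatures;

typedef struct {
  int min_partition_size; /* thread-local working copy */
  int max_partition_size;
} ThreadMb;

/* Copy the initial limits into the thread's own struct. */
static void thread_mb_init(ThreadMb *mb, const SharedSpeedFeatures *sf) {
  assert(mb && sf);
  mb->min_partition_size = sf->min_partition_size;
  mb->max_partition_size = sf->max_partition_size;
}

/* Assumed SB-level adjustment: narrow this thread's range only. */
static void thread_mb_adjust_for_sb(ThreadMb *mb, int sb_min, int sb_max) {
  assert(mb && sb_min <= sb_max);
  if (sb_min > mb->min_partition_size) mb->min_partition_size = sb_min;
  if (sb_max < mb->max_partition_size) mb->max_partition_size = sb_max;
}
```

Each worker would call thread_mb_init once and then thread_mb_adjust_for_sb per superblock on its own ThreadMb, leaving the shared struct read-only.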
      
      Change-Id: I7dd08dc918d9f772fcd718bbd6533e0787720ad4
  30. 06 Nov, 2014 1 commit
    • Rework cut-off decisions in cyclic refresh aq mode · caaf63b2
      Jingning Han authored
      This commit removes the cyclic aq mode dependency on
      in_static_area and reworks the corresponding cut-off thresholds.
      It improves the compression performance of speed -5 by 1.47% in
      PSNR and 2.07% in SSIM, and the compression performance of speed
      -6 by 3.10% in PSNR and 5.25% in SSIM. Speed wise, about 1% faster
      in both settings at high bit-rates.
      
      Change-Id: I1ffc775afdc047964448d9dff5751491ba4ff4a9
  31. 09 Oct, 2014 1 commit
  32. 01 Oct, 2014 1 commit
  33. 12 Sep, 2014 1 commit
    • Adds high bitdepth transform functions and tests · 10783d4f
      Deb Mukherjee authored
      Adds various high bitdepth transform functions and tests.
      Most of the changes relate to using the typedefs tran_low_t
      and tran_high_t for the final transform coefficients and intermediate
      stages of the transform computation, respectively, rather than the fixed
      types int16_t/int. When the vp9_highbitdepth configure flag is off,
      these map to int16_t/int32_t, but when the flag is on, they map
      to int32_t/int64_t to make space for the needed extra precision.
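
A minimal sketch in C of the typedef switch described above. The macro SKETCH_HIGHBITDEPTH is a hypothetical stand-in for whatever symbol the configure flag defines, and avg_diff2 is a toy function, not a transform from the patch. It illustrates keeping intermediates in tran_high_t so that a sum of two tran_low_t values cannot overflow before it is scaled back down.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro standing in for the configure flag. */
#ifndef SKETCH_HIGHBITDEPTH
#define SKETCH_HIGHBITDEPTH 0
#endif

#if SKETCH_HIGHBITDEPTH
typedef int32_t tran_low_t;  /* final coefficients */
typedef int64_t tran_high_t; /* intermediate stages */
#else
typedef int16_t tran_low_t;
typedef int32_t tran_high_t;
#endif

/* Toy 2-point stage: half-sum and half-difference of the inputs.
 * Intermediates use tran_high_t; results are stored as tran_low_t. */
static void avg_diff2(const tran_low_t in[2], tran_low_t out[2]) {
  assert(sizeof(tran_high_t) > sizeof(tran_low_t));
  const tran_high_t sum = (tran_high_t)in[0] + in[1];
  const tran_high_t diff = (tran_high_t)in[0] - in[1];
  out[0] = (tran_low_t)(sum / 2);
  out[1] = (tran_low_t)(diff / 2);
}
```

With the flag off, inputs of 30000 and 30000 sum to 60000, which does not fit in int16_t but is held safely in the 32-bit intermediate before being halved.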
      
      Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8
  34. 03 Sep, 2014 1 commit
    • Speed up compound inter prediction mode check · d62d804e
      Jingning Han authored
      This commit allows the encoder to store outcomes of single reference
      frame modes and compares them to decide if the inter prediction
      filter, forward transform, and quantization can be skipped.
      
      The compression performance of speed 3 is down:
      derf  -0.364%
      stdhd -0.198%
      
      For test sequences, the speed 3 runtime is reduced:
      highway CIF 100 kbps, 51976 ms -> 45033 ms, 13% speed-up
      stockholm 720p 1000 kbps, 71826 ms -> 67838 ms, 5.5% speed-up
      pedestrian 1080p 2000 kbps, 154924 ms -> 150702 ms, 2.6% speed-up
      
      Change-Id: I5aa26f918d2b4b5197a2c0afa2779319f1c88e44
  35. 29 Aug, 2014 1 commit
  36. 22 Aug, 2014 1 commit
    • Move mv cost table to VP9_COMP · 2b1c6eac
      Jingning Han authored
      The mv cost table set is maintained at frame level, hence moved to
      VP9_COMP.
      
      Change-Id: Icb3d0185d47443590bd11357de729aa4ba5c5e5e
  37. 08 Aug, 2014 1 commit
  38. 05 Aug, 2014 1 commit
  39. 04 Aug, 2014 1 commit