1. 12 Sep, 2014 - 1 commit
  2. 09 Sep, 2014 - 1 commit
    • Remove the use of use_lastframe_partitioning at speed 4 · f10d7eed
      Yunqing Wang authored
      The use of use_lastframe_partitioning is totally removed in good-
      quality encoding. Its usage in real-time encoding needs to be
      evaluated to see if it can be removed too.
      
      The Borg tests at speed 4 showed:
      stdhd set: 0.220% psnr gain, 0.166% ssim gain;
      derf set:  0.329% psnr gain, 0.476% ssim gain.
      
      Speed tests on selected clips showed a 1.54% speedup. (Worst case:
      pedestrian_area_1080p25.y4m, speed loss: 1.5%.)
      
      Change-Id: I1c844d329b0b5678558439b887297c1be7ddab00
  3. 03 Sep, 2014 - 1 commit
    • select_tx_mode(): remove special case for key frame · c1058e5b
      Yaowu Xu authored
      This commit removes the special case for key frame, as transform size
      decision is controlled by the appropriate speed feature for all lossy
      coding modes: tx_size_search_method.
      
      Change-Id: I9677171e3f2432ec23705f7c5ea8170dd4562fae
  4. 28 Aug, 2014 - 1 commit
    • Early termination in encoding partition search · 4d2c3769
      Yunqing Wang authored
      In the partition search, the encoder checks all possible
      partitionings in the superblock's partition search tree.
      This patch proposes a set of criteria for partition search
      early termination. They decide whether or not to terminate
      the search in the current branch based on the "skippable"
      result of the quantized transform coefficients. The
      "skippable" information is gathered during the partition
      mode search, so no overhead calculations are introduced.
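
The early-termination idea described above can be sketched in C as follows. This is only an illustration, not the code from the patch: `block_is_skippable` and `terminate_partition_search` are hypothetical names, and the rule "stop as soon as the branch is skippable" is an assumed simplification of the patch's criteria.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a block is "skippable" when every quantized
 * transform coefficient is zero. */
static bool block_is_skippable(const int16_t *qcoeff, size_t count) {
  for (size_t i = 0; i < count; ++i) {
    if (qcoeff[i] != 0) return false;
  }
  return true;
}

/* Assumed criterion (not the actual libvpx rule): stop descending into
 * smaller partitions once the current branch is skippable. */
static bool terminate_partition_search(const int16_t *qcoeff, size_t count) {
  return block_is_skippable(qcoeff, count);
}
```

Because the coefficients are already quantized during the mode search, a check like this reuses existing results rather than adding a new computation.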
      
      This patch gives significant encoding speed gains without
      sacrificing the quality.
      
      Borg test results:
      1. At speed 1,
         stdhd set: psnr: +0.074%, ssim: +0.093%;
         derf set:  psnr: -0.024%, ssim: +0.011%;
      2. At speed 2,
         stdhd set: psnr: +0.033%, ssim: +0.100%;
         derf set:  psnr: -0.062%, ssim: +0.003%;
      3. At speed 3,
         stdhd set: psnr: +0.060%, ssim: +0.190%;
         derf set:  psnr: -0.064%, ssim: -0.002%;
      4. At speed 4,
         stdhd set: psnr: +0.070%, ssim: +0.143%;
         derf set:  psnr: -0.104%, ssim: +0.039%;
      
      The speedup ranges from several percent to 60+%.
                       speed1    speed2    speed3    speed4
      (1080p, 100f):
      old_town_cross:  48.2%     23.9%     20.8%     16.5%
      park_joy:        11.4%     17.8%     29.4%     18.2%
      pedestrian_area: 10.7%      4.0%      4.2%      2.4%
      (720p, 200f):
      mobcal:          68.1%     36.3%     34.4%     17.7%
      parkrun:         15.8%     24.2%     37.1%     16.8%
      shields:         45.1%     32.8%     30.1%      9.6%
      (cif, 300f)
      bus:              3.7%     10.4%     14.0%      7.9%
      deadline:        13.6%     14.8%     12.6%     10.9%
      mobile:           5.3%     11.5%     14.7%     10.7%
      
      Change-Id: I246c38fb952ad762ce5e365711235b605f470a66
  5. 25 Aug, 2014 - 2 commits
  6. 15 Aug, 2014 - 2 commits
    • Add a speed feature to give the tighter search range · eca93642
      Pengchong Jin authored
      Add a speed feature that gives a tighter partition search
      range. Before the partition search, calculate the histogram
      of the partition sizes of the left, above and previous
      co-located blocks of the current block. If the variance of
      the observed partition sizes is small enough, narrow the
      search range to a tighter one around the mean partition size.
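
A minimal sketch of the range-tightening step, under stated assumptions: partition sizes are represented as log2 edge lengths (2 for 4x4 up to 6 for 64x64), the observations are passed as a plain list rather than a histogram, and the tightened range is one step either side of the rounded mean. `SearchRange`, `tighten_search_range` and `var_thresh` are hypothetical names, not libvpx identifiers.

```c
/* Hypothetical search range, expressed as log2 of the block edge. */
typedef struct {
  int min_log2;
  int max_log2;
} SearchRange;

/* Returns `full` unless the neighbors' partition sizes agree closely,
 * in which case the range is narrowed around their mean. */
static SearchRange tighten_search_range(const int *sizes_log2, int count,
                                        SearchRange full, double var_thresh) {
  if (count <= 0) return full;

  double sum = 0.0, sum_sq = 0.0;
  for (int i = 0; i < count; ++i) {
    sum += sizes_log2[i];
    sum_sq += (double)sizes_log2[i] * sizes_log2[i];
  }
  const double mean = sum / count;
  const double variance = sum_sq / count - mean * mean;
  if (variance > var_thresh) return full; /* neighbors disagree */

  /* Assumed width: one size step on each side of the rounded mean. */
  const int center = (int)(mean + 0.5);
  SearchRange tight = { center - 1, center + 1 };
  if (tight.min_log2 < full.min_log2) tight.min_log2 = full.min_log2;
  if (tight.max_log2 > full.max_log2) tight.max_log2 = full.max_log2;
  return tight;
}
```

With neighbor sizes {4, 4, 4, 5} and a threshold of 0.5, the variance is 0.1875, so the range narrows to 3..5; with {2, 6, 3, 6} the variance is 3.1875 and the full range is kept.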
      
      The feature is currently turned on at speed 2. Experiments on
      sample youtube clips show on average the runtime is reduced
      by 3-7%.
      
      For hard stdhd clips:
      park_joy_1080p @ 15000kbps:       509251 ms -> 491953 ms (3.3%)
      pedestrian_area_1080p @ 2000kbps: 223941 ms -> 214226 ms (4.3%)
      
      The PSNR performance is changed:
      derf: -0.112%
      yt:   -0.099%
      hd:   -0.090%
      stdhd:-0.102%
      
      Change-Id: Ie205ec5325bf92ec5676c243e30ba9d0adca10f2
    • Remove an unused speed feature · 28b1437d
      Yunqing Wang authored
      Removed disable_split_var_thresh, which is not used anymore.
      
      Change-Id: I50119b150442e1571157433b5effc6aae0dbe0fd
  7. 14 Aug, 2014 - 2 commits
  8. 13 Aug, 2014 - 1 commit
    • Simplify select_tx_mode() · b6a41802
      Yaowu Xu authored
      The function is called only once, right after all stats counters are
      reset to 0. Therefore all the computations have zero effect on the
      return value. This commit removes that ineffective code.
      
      Change-Id: I50d27c0802547921fa36c60aa4bd92d76247f595
  9. 08 Aug, 2014 - 1 commit
    • Moving pass from VP9_COMP to VP9EncoderConfig. · 91c2f1e4
      Dmitry Kovalev authored
      We had a very complicated way to initialize cpi->pass from
      cfg->g_pass:
      switch (cfg->g_pass) {
        case VPX_RC_ONE_PASS:
          oxcf->mode = ONE_PASS_GOOD;
          break;
        case VPX_RC_FIRST_PASS:
          oxcf->mode = TWO_PASS_FIRST;
          break;
        case VPX_RC_LAST_PASS:
          oxcf->mode = TWO_PASS_SECOND_BEST;
          break;
      }
      
      cpi->pass = get_pass(oxcf->mode).
      
      Now pass is moved to VP9EncoderConfig and initialization is simple:
      switch (cfg->g_pass) {
        case VPX_RC_ONE_PASS:
          oxcf->pass = 0;
          break;
        case VPX_RC_FIRST_PASS:
          oxcf->pass = 1;
          break;
        case VPX_RC_LAST_PASS:
          oxcf->pass = 2;
          break;
      }
      
      Change-Id: I8f582203a4575f5e39b071598484a8ad2b72e0d9
  10. 07 Aug, 2014 - 3 commits
  11. 06 Aug, 2014 - 1 commit
    • Integrate fast txfm and quant path into skip_recode system · 8684c232
      Jingning Han authored
      This commit integrates the fast transform and quantization process
      into skip_recode scheme in the rate-distortion optimization loop.
      Previously the fast transform and quantization process was only
      enabled for non-RD coding flow.
      
      Change-Id: Ib7db4d39b7033f1495c75897271f769799198ba8
  12. 05 Aug, 2014 - 2 commits
    • Directly split the block in partition search · 74593c1e
      Pengchong Jin authored
      This patch allows the encoder to directly split the block
      in partition search, thereby skipping the search for NONE.
      It computes a score which measures whether the 16x16 motion
      vectors from the first pass in the current block are consistent
      with each other. If they are inconsistent and we have enough Q
      to encode, split the block directly, and skip searching NONE.
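
The direct-split decision could look roughly like the sketch below. The commit message does not define the score, so the sum of absolute deviations from the mean motion vector is an assumed stand-in, and the "enough Q" condition is passed in as a precomputed flag. `MotionVector`, `mv_inconsistency_score` and `split_directly` are hypothetical names.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical first-pass motion vector for one 16x16 block. */
typedef struct {
  int row;
  int col;
} MotionVector;

/* Assumed score: sum of absolute deviations from the mean vector.
 * Zero means all vectors agree; larger means less consistent. */
static int mv_inconsistency_score(const MotionVector *mvs, int count) {
  if (count <= 0) return 0;

  long sum_row = 0, sum_col = 0;
  for (int i = 0; i < count; ++i) {
    sum_row += mvs[i].row;
    sum_col += mvs[i].col;
  }
  const int mean_row = (int)(sum_row / count);
  const int mean_col = (int)(sum_col / count);

  int score = 0;
  for (int i = 0; i < count; ++i) {
    score += abs(mvs[i].row - mean_row) + abs(mvs[i].col - mean_col);
  }
  return score;
}

/* Split without searching NONE when the vectors disagree and the
 * caller has determined that there is enough Q to encode. */
static bool split_directly(const MotionVector *mvs, int count,
                           int score_thresh, bool enough_q) {
  return enough_q && mv_inconsistency_score(mvs, count) > score_thresh;
}
```

Four identical vectors score 0 and never trigger a split, while the vectors (0,0), (16,0), (0,16), (16,16) score 64 against their mean of (8,8).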
      
      This feature is under flag CONFIG_FP_MB_STATS. In speed 2,
      it further gives a speedup of 3-8% on sample yt clips as
      compared to the previous version under the same flag. Overall,
      the features under the flag will give 7-15% on typical yt
      clips at up to 6000kbps data rate. The speedup at very high
      data rate is not significant.
      
      For hard stdhd clips:
      park_joy_1080p @ 15000kbps:       504541ms -> 506293ms (-0.35%)
      pedestrian_area_1080p @ 2000kbps: 326610ms -> 290090ms (+11.2%)
      
      The compression performance using the features under the flag:
      derf: -0.068%
      yt:   -0.189%
      hd:   -0.318%
      stdhd:-0.183%
      
      To use the feature, set CONFIG_FP_MB_STATS and turn on
      cpi->use_fp_mb_stats.
      
      Change-Id: Iad58a2966515c8861aa9eb211565b1864048d47f
    • Extend skip_txfm flag into array to cover YUV planes · 1a8d45f3
      Jingning Han authored
      Change-Id: Ieae182d72d625d0d3fd4ed7c7d24cb521a0f21b0
  13. 04 Aug, 2014 - 2 commits
  14. 31 Jul, 2014 - 1 commit
  15. 30 Jul, 2014 - 3 commits
    • Early termination after partition NONE is done in RD. · 49866baa
      Pengchong Jin authored
      This patch allows the encoder to skip the search for partition
      SPLIT, HORZ, VERT after the search for partition NONE is done
      in RD optimization. It uses the first pass block-wise statistics
      to make the decision. If all 16x16 blocks in the current partition
      have zero motion and small residues in the first pass statistics,
      and the partition has a small difference variance, further partition
      search is skipped.
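
As a rough sketch of the test described above (not the patch's code), the check can be written as a scan over per-block first-pass flags followed by a variance comparison. `FirstPassBlockStats`, `skip_after_none` and `variance_thresh` are hypothetical names, and the flags are assumed to have been derived from the first-pass statistics beforehand.

```c
#include <stdbool.h>

/* Hypothetical per-16x16-block flags taken from first-pass stats. */
typedef struct {
  bool zero_motion;
  bool small_residual;
} FirstPassBlockStats;

/* Returns true when SPLIT, HORZ and VERT can be skipped after NONE:
 * every block is static with a small residual, and the difference
 * variance of the partition is below the (assumed) threshold. */
static bool skip_after_none(const FirstPassBlockStats *blocks, int count,
                            unsigned int diff_variance,
                            unsigned int variance_thresh) {
  if (count <= 0) return false;
  for (int i = 0; i < count; ++i) {
    if (!blocks[i].zero_motion || !blocks[i].small_residual) return false;
  }
  return diff_variance < variance_thresh;
}
```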
      
      For speed 2 setting, experiments on general youtube clips show that
      the speedup varies from 1% - 10%, 5% on average. On the performance
      side in PSNR, derf 0.004%, yt -0.059%, hd -0.106%, stdhd 0.032%.
      
      For hard stdhd clips:
      park_joy_1080p, 502952 ms -> 503307 ms (-0.07%)
      pedestrian_area_1080p, 227049 ms -> 220531 ms (+3%)
      
      This feature is under the compilation flag CONFIG_FP_MB_STATS and
      it is off in current setting.
      
      Change-Id: I554537e9242178263b65ebe14a04f9c221b58bae
    • Refactor rd_pick_partition interface · d82ff942
      Jingning Han authored
      Remove the variable that indicates the relative block index. This
      is explicitly covered by the use of pc_tree.
      
      Change-Id: Ib13142582fff926c85e375bde656aa050add8350
    • Chessboard pattern partition search · ca2dcb7f
      Jingning Han authored
      This commit enables a chessboard pattern constrained partition
      search for 720p and above resolutions. The scheme applies a stricter
      partition search to alternate blocks based on their above/left
      neighboring blocks' partition range, as well as that of the
      collocated blocks in the previous frame. It is currently turned
      on at 16x16 block size level. The chessboard pattern is flipped
      per coding frame.
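
One way to picture the chessboard selection is the parity test sketched below. It assumes non-negative block coordinates in units of 16x16 blocks and a per-frame counter; which parity receives the constrained search is an arbitrary choice here, and `use_constrained_search` is a hypothetical name.

```c
#include <stdbool.h>

/* Returns true for the blocks that get the stricter partition search.
 * Adding the frame index flips the pattern on every coded frame. */
static bool use_constrained_search(int blk_row, int blk_col,
                                   int frame_index) {
  return ((blk_row + blk_col + frame_index) & 1) == 0;
}
```

Horizontally or vertically adjacent blocks always get opposite answers, and any given block alternates between the two search modes from one frame to the next.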
      
      The speed 3 runtime is reduced:
      park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up)
      pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up)
      
      The compression performance is changed:
      hd     -0.223%
      stdhd  -0.295%
      
      Change-Id: I2d4d123ae89f7171562f618febb4d81789575b19
  16. 29 Jul, 2014 - 2 commits
  17. 25 Jul, 2014 - 1 commit
    • Fix rd_pick_partition search loop for 4x4 blocks · 84af0486
      Jingning Han authored
      The partition search for 4x4 blocks takes unnecessary steps to
      reconstruct pixels and an extra partition type update. This commit
      removes such operations. No visible compression/speed difference.
      Thanks to Yue (yuec@) for finding this issue.
      
      Change-Id: I3f83824aa3fd3717d63be0b280fa57258939a70a
  18. 24 Jul, 2014 - 1 commit
  19. 22 Jul, 2014 - 1 commit
    • Fix get_frame_type function · caad1686
      Adrian Grange authored
      Fixed the function get_frame_type to return the correct
      frame type for golden and last frames.
      
      Change-Id: I8edddd9aa26cbe7a1de8ff211389410b22b1bd14
  20. 21 Jul, 2014 - 2 commits
  21. 17 Jul, 2014 - 1 commit
  22. 15 Jul, 2014 - 1 commit
    • VP9 Denoiser denoises after mode/bsize search · 03819ed9
      Tim Kopp authored
      In VP8, statistics are collected about the different modes as they are searched.
      In VP9 this process is more complicated because of the variable block size. Fields were
      added to the PICK_MODE_CONTEXT struct to hold this information for each point in
      the search. The information is then taken from the appropriate part of the tree
      during denoising.
      
      Change-Id: I89261ab77ad637821287ae157dfdf694702b8e77
  23. 11 Jul, 2014 - 1 commit
  24. 07 Jul, 2014 - 1 commit
  25. 02 Jul, 2014 - 2 commits
    • Split vp9_rdopt into vp9_rdopt and vp9_rd. · 03c276ea
      Alex Converse authored
      vp9_rdopt is for making rd optimal mode decisions. vp9_rd is for all
      other rd related routines. Anything used outside of making an rd optimal
      decision belongs in rd.
      
      Change-Id: I772a3073f7588bdf139f551fb9810b6864d8e64b
    • Re-design quantization process · 9ac2f663
      Jingning Han authored
      This commit re-designs the quantization process for transform
      coefficient blocks of size 4x4 to 16x16. It improves compression
      performance for speed 7 by 3.85%. The SSSE3 version for the
      new quantization process is included.
      
      The average runtime of the 8x8 block quantization is reduced
      from 285 cycles to 255 cycles, i.e., over 10% faster.
      
      Change-Id: I61278aa02efc70599b962d3314671db5b0446a50
  26. 01 Jul, 2014 - 1 commit
  27. 30 Jun, 2014 - 2 commits
    • change to not force interp_type as SWITCHABLE · 186bd4eb
      Yaowu Xu authored
      Encoder still uses SWITCHABLE as default via DEFAULT_INTERP_FILTER,
      but does not override the default if it is not SWITCHABLE.
      
      Change-Id: I3c0f6653bd228381a623a026c66599b0a87d01d5
    • Remove unused set_mode_info function · 30ab3701
      Jingning Han authored
      When the frame is intra coded only, the encoder takes the RD
      coding flow. Hence the function set_mode_info is not practically
      in use. This commit removes it and the associated conditional
      branches.
      
      Change-Id: I1e42659ceb55b771ba712d1cdecacb446aa6460d