1. 05 Aug, 2014 - 1 commit
      Directly split the block in partition search · 74593c1e
      Pengchong Jin authored
      This patch allows the encoder to split the block directly in
      partition search, thereby skipping the search of partition NONE.
      It computes a score that measures whether the 16x16 motion vectors
      from the first pass in the current block are consistent with
      each other. If they are inconsistent and we have enough Q
      to encode, split the block directly and skip searching NONE.
      
      This feature is under the flag CONFIG_FP_MB_STATS. In speed 2,
      it gives a further speedup of 3-8% on sample yt clips
      compared to the previous version under the same flag. Overall,
      the features under the flag give a 7-15% speedup on typical yt
      clips at data rates up to 6000kbps. The speedup at very high
      data rates is not significant.
      
      For hard stdhd clips:
      park_joy_1080p @ 15000kbps:       504541ms -> 506293ms (-0.35%)
      pedestrian_area_1080p @ 2000kbps: 326610ms -> 290090ms (+11.2%)
      
      The compression performance using the features under the flag:
      derf: -0.068%
      yt:   -0.189%
      hd:   -0.318%
      stdhd:-0.183%
      
      To use the feature, set CONFIG_FP_MB_STATS and turn on
      cpi->use_fp_mb_stats.
      
      Change-Id: Iad58a2966515c8861aa9eb211565b1864048d47f
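The decision described in this commit message can be sketched as follows. This is a minimal illustration, not the patch itself: the commit message does not define the score, so the spread-based score, the `FpMv` type, the function names, the threshold, and the `enough_q` flag are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical first-pass motion vector of one 16x16 block. */
typedef struct {
  int row;
  int col;
} FpMv;

/* Illustrative inconsistency score (an assumption, not the patch's formula):
 * the larger of the row spread and the column spread (max - min) over all
 * 16x16 motion vectors in the block. Identical vectors give a score of 0. */
int mv_inconsistency_score(const FpMv *mvs, size_t count) {
  assert(mvs != NULL && count > 0);
  int min_row = mvs[0].row, max_row = mvs[0].row;
  int min_col = mvs[0].col, max_col = mvs[0].col;
  for (size_t i = 1; i < count; ++i) {
    if (mvs[i].row < min_row) min_row = mvs[i].row;
    if (mvs[i].row > max_row) max_row = mvs[i].row;
    if (mvs[i].col < min_col) min_col = mvs[i].col;
    if (mvs[i].col > max_col) max_col = mvs[i].col;
  }
  const int row_spread = max_row - min_row;
  const int col_spread = max_col - min_col;
  return row_spread > col_spread ? row_spread : col_spread;
}

/* Split directly (and skip NONE) only when the vectors are inconsistent
 * and the caller reports that there is enough Q to encode. */
bool should_split_directly(const FpMv *mvs, size_t count, int score_thresh,
                           bool enough_q) {
  return enough_q && mv_inconsistency_score(mvs, count) > score_thresh;
}
```

The "enough Q" test is passed in as a boolean because the commit message does not say how it is evaluated.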
  2. 04 Aug, 2014 - 2 commits
  3. 31 Jul, 2014 - 1 commit
  4. 30 Jul, 2014 - 3 commits
      Early termination after partition NONE is done in RD. · 49866baa
      Pengchong Jin authored
      This patch allows the encoder to skip the search for partitions
      SPLIT, HORZ, and VERT after the search for partition NONE is done
      in RD optimization. It uses the first pass block-wise statistics
      to make the decision. If all 16x16 blocks in the current partition
      have zero motion and small residues in the first pass statistics,
      and the partition has a small difference variance, further
      partition search is skipped.
      
      For speed 2 setting, experiments on general youtube clips show that
      the speedup varies from 1% - 10%, 5% on average. On the performance
      side in PSNR, derf 0.004%, yt -0.059%, hd -0.106%, stdhd 0.032%.
      
      For hard stdhd clips:
      park_joy_1080p, 502952 ms -> 503307 ms (-0.07%)
      pedestrian_area_1080p, 227049 ms -> 220531 ms (+3%)
      
      This feature is under the compilation flag CONFIG_FP_MB_STATS and
      it is off in current setting.
      
      Change-Id: I554537e9242178263b65ebe14a04f9c221b58bae
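The early-termination test described in this commit message amounts to a simple predicate. The sketch below is illustrative only: the `FpBlockStat` record, the function name, and both thresholds are hypothetical, since the commit message gives no concrete values or data layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical first-pass record for one 16x16 block. */
typedef struct {
  bool zero_motion;  /* first pass found a zero motion vector */
  unsigned residue;  /* first-pass residual measure */
} FpBlockStat;

/* Returns true when SPLIT, HORZ, and VERT can be skipped after NONE:
 * the difference variance is small, and every 16x16 block has zero motion
 * and a small residue. Both thresholds are assumptions. */
bool can_stop_after_none(const FpBlockStat *stats, size_t count,
                         unsigned residue_thresh, unsigned diff_var,
                         unsigned diff_var_thresh) {
  assert(stats != NULL && count > 0);
  if (diff_var >= diff_var_thresh) return false;
  for (size_t i = 0; i < count; ++i) {
    if (!stats[i].zero_motion || stats[i].residue >= residue_thresh)
      return false;
  }
  return true;
}
```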
      Refactor rd_pick_partition interface · d82ff942
      Jingning Han authored
      Remove the variable that indicates the relative block index. This
      is explicitly covered by the use of pc_tree.
      
      Change-Id: Ib13142582fff926c85e375bde656aa050add8350
      Chessboard pattern partition search · ca2dcb7f
      Jingning Han authored
      This commit enables a chessboard pattern constrained partition
      search for 720p and above resolutions. The scheme applies a stricter
      partition search to alternate blocks, based on the partition range
      of their above/left neighboring blocks as well as that of the
      collocated blocks in the previous frame. It is currently turned
      on at the 16x16 block size level. The chessboard pattern is flipped
      on every coded frame.
      
      The speed 3 runtime is reduced:
      park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up)
      pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up)
      
      The compression performance is changed:
      hd     -0.223%
      stdhd  -0.295%
      
      Change-Id: I2d4d123ae89f7171562f618febb4d81789575b19
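One way to picture the chessboard selection described in this commit message is a parity test on the block position and frame number. This is a sketch under assumptions: the function name, the use of 16x16 block coordinates, and the choice of which parity is constrained on even frames are all hypothetical.

```c
#include <stdbool.h>

/* Returns true for the blocks that get the stricter (constrained) partition
 * search. Blocks alternate like the squares of a chessboard, and adding the
 * frame index flips the pattern on every coded frame.
 * blk_row and blk_col are block coordinates in 16x16 units (assumption). */
bool use_constrained_search(unsigned blk_row, unsigned blk_col,
                            unsigned frame_index) {
  return ((blk_row + blk_col + frame_index) & 1u) == 0;
}
```

With this parity rule, a block that is constrained in one frame is searched normally in the next, so no position is restricted in two consecutive frames.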
  5. 29 Jul, 2014 - 2 commits
  6. 25 Jul, 2014 - 1 commit
      Fix rd_pick_partition search loop for 4x4 blocks · 84af0486
      Jingning Han authored
      The partition search for 4x4 blocks takes unnecessary steps to
      reconstruct pixels and an extra partition type update. This commit
      removes such operations. No visible compression/speed difference.
      Thanks to Yue (yuec@) for finding this issue.
      
      Change-Id: I3f83824aa3fd3717d63be0b280fa57258939a70a
  7. 24 Jul, 2014 - 1 commit
  8. 22 Jul, 2014 - 1 commit
      Fix get_frame_type function · caad1686
      Adrian Grange authored
      Fixed the function get_frame_type to return the correct
      frame type for golden and last frames.
      
      Change-Id: I8edddd9aa26cbe7a1de8ff211389410b22b1bd14
  9. 21 Jul, 2014 - 2 commits
  10. 17 Jul, 2014 - 1 commit
  11. 15 Jul, 2014 - 1 commit
      VP9 Denoiser denoises after mode/bsize search · 03819ed9
      Tim Kopp authored
      In vp8, statistics are collected about the different modes as they are searched.
      In VP9 this process is more complicated due to the variable block size. Fields were
      added to the PICK_MODE_CONTEXT struct to hold this information for each point in
      the search. The information is then taken from the appropriate part of the tree
      during denoising.
      
      Change-Id: I89261ab77ad637821287ae157dfdf694702b8e77
  12. 11 Jul, 2014 - 1 commit
  13. 07 Jul, 2014 - 1 commit
  14. 02 Jul, 2014 - 2 commits
      Split vp9_rdopt into vp9_rdopt and vp9_rd. · 03c276ea
      Alex Converse authored
      vp9_rdopt is for making rd optimal mode decisions. vp9_rd is for all
      other rd related routines. Anything used outside of making an rd optimal
      decision belongs in rd.
      
      Change-Id: I772a3073f7588bdf139f551fb9810b6864d8e64b
      Re-design quantization process · 9ac2f663
      Jingning Han authored
      This commit re-designs the quantization process for transform
      coefficient blocks of size 4x4 to 16x16. It improves compression
      performance for speed 7 by 3.85%. The SSSE3 version for the
      new quantization process is included.
      
      The average runtime of the 8x8 block quantization is reduced
      from 285 cycles to 255 cycles, i.e., over 10% faster.
      
      Change-Id: I61278aa02efc70599b962d3314671db5b0446a50
  15. 01 Jul, 2014 - 1 commit
  16. 30 Jun, 2014 - 3 commits
      change to not force interp_type as SWITCHABLE · 186bd4eb
      Yaowu Xu authored
      Encoder still uses SWITCHABLE as default via DEFAULT_INTERP_FILTER,
      but does not override the default if it is not SWITCHABLE.
      
      Change-Id: I3c0f6653bd228381a623a026c66599b0a87d01d5
      Remove unused set_mode_info function · 30ab3701
      Jingning Han authored
      When the frame is intra coded only, the encoder takes the RD
      coding flow. Hence the function set_mode_info is not practically
      in use. This commit removes it and the associated conditional
      branches.
      
      Change-Id: I1e42659ceb55b771ba712d1cdecacb446aa6460d
      Decide the partitioning threshold from the variance histogram · 9d41313e
      Yunqing Wang authored
      Before encoding a frame, calculate and store each 16x16 block's
      variance of source difference between last and current frame.
      Find partitioning threshold T for the frame from its variance
      histogram, and then use T to make partition decisions.
      
      Compared with fixed 16x16 partitioning, the rtc set test showed an
      overall psnr gain of 3.242% and an ssim gain of 3.751%. The best
      psnr gain is 8.653%.
      
      The overall encoding speed didn't change much. It got faster for
      some clips (for example, a 12% speedup for vidyo1), and a little
      slower for others.
      
      Also, a minor modification was made in datarate unit test.
      
      Change-Id: Ie290743aa3814e83607b93831b667a2a49d0932c
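The commit message does not say how the threshold T is derived from the histogram, so the following is only one plausible sketch: it bins the per-block variances and returns the upper bound of the first bin at which a chosen percentage of blocks has been covered. The percentile rule, the bin count, the bin width, and the function name are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BINS 8 /* hypothetical histogram size */

/* Builds a histogram of 16x16 source-difference variances and returns a
 * threshold T such that at least `percent` percent of the blocks have a
 * variance in a bin at or below T. Variances past the last bin are clamped
 * into it. This percentile rule is an assumption for illustration. */
unsigned threshold_from_histogram(const unsigned *variances, size_t count,
                                  unsigned bin_width, unsigned percent) {
  assert(variances != NULL && count > 0 && bin_width > 0 && percent <= 100);
  size_t hist[NUM_BINS] = { 0 };
  for (size_t i = 0; i < count; ++i) {
    size_t bin = variances[i] / bin_width;
    if (bin >= NUM_BINS) bin = NUM_BINS - 1; /* clamp outliers */
    ++hist[bin];
  }
  size_t cumulative = 0;
  for (size_t bin = 0; bin < NUM_BINS; ++bin) {
    cumulative += hist[bin];
    if (cumulative * 100 >= count * percent)
      return (unsigned)(bin + 1) * bin_width;
  }
  return NUM_BINS * bin_width; /* not reached: the last bin covers all */
}
```

A partition decision could then compare each block's variance against T, although the direction and granularity of that comparison are not stated in the commit message.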
  17. 29 Jun, 2014 - 1 commit
  18. 26 Jun, 2014 - 2 commits
      Adaptive txfm size selection depending on residual sse/variance · 5a3e3c6d
      Jingning Han authored
      This commit enables an adaptive transform size selection method
      for speed -6. It uses the largest transform size when the sse is
      more than 4 times the variance, i.e., most energy is compacted in
      the DC coefficient. Otherwise, it uses the default TX_8X8. It improves
      the compression efficiency for the rtc set at speed -6 by 0.8%, with
      no speed change observed.
      
      Change-Id: Ie6ed1e728ff7bf88ebe940a60811361cdd19969c
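The selection rule in this commit message can be written as a short function. This is a sketch, not the committed code: the `TxSize` enum is a stand-in for the encoder's transform size type, the function name is hypothetical, and clamping the TX_8X8 default to the largest allowed size is an added assumption for blocks that cannot hold an 8x8 transform.

```c
#include <stdint.h>

/* Stand-in for the encoder's transform size type, ordered small to large. */
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32 } TxSize;

/* Picks the largest allowed transform when sse > 4 * var, otherwise the
 * TX_8X8 default (clamped to `largest`, which is an assumption).
 * 64-bit arithmetic avoids overflow in 4 * var. */
TxSize select_tx_size(uint32_t sse, uint32_t var, TxSize largest) {
  if ((uint64_t)sse > 4 * (uint64_t)var) return largest;
  return largest < TX_8X8 ? largest : TX_8X8;
}
```

If the variance is taken as the sse minus the squared sum divided by the pixel count (an assumption about the definition), then sse > 4 * var means the DC term carries more than three quarters of the residual energy.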
      Skip the partition search for the frame with no motion · 12861260
      Pengchong Jin authored
      This patch allows the encoder to skip the partition search for the
      frame if it is an inter frame and only zero motion vectors have
      been detected in the first pass. The partition size is directly
      assigned according to the difference variance.
      
      Borg tests show little overall performance change in terms of PSNR
      (derf -0.027%, yt 0.152%, hd 0.078%, stdhd 0%). The worst-case
      PSNR loss is -0.514%, from yt. The best PSNR gain is 4.293%, from yt.
      The second pass encoding speedup for slideshow clips is 15%-40%.
      
      Change-Id: I881f347d286553ee5594a9ea09ba1a61ac684045
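The two steps in this commit message, detecting a zero-motion inter frame and then assigning a partition size from the difference variance, might look like the sketch below. The types, function names, the three block sizes, and both variance thresholds are hypothetical; the commit message does not give the actual mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical first-pass motion vector of one block. */
typedef struct {
  int row;
  int col;
} FpMv;

/* Stand-in block sizes, ordered small to large. */
typedef enum { BLK_16X16, BLK_32X32, BLK_64X64 } BlkSize;

/* The partition search is skipped only for inter frames in which the first
 * pass detected nothing but zero motion vectors. */
bool frame_skips_partition_search(bool is_inter_frame, const FpMv *mvs,
                                  size_t count) {
  assert(mvs != NULL && count > 0);
  if (!is_inter_frame) return false;
  for (size_t i = 0; i < count; ++i) {
    if (mvs[i].row != 0 || mvs[i].col != 0) return false;
  }
  return true;
}

/* Assigns a partition size directly from the difference variance: flatter
 * areas get larger blocks. The thresholds are assumptions. */
BlkSize size_from_diff_variance(unsigned diff_var, unsigned low_thresh,
                                unsigned high_thresh) {
  assert(low_thresh <= high_thresh);
  if (diff_var < low_thresh) return BLK_64X64;
  if (diff_var < high_thresh) return BLK_32X32;
  return BLK_16X16;
}
```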
  19. 24 Jun, 2014 - 2 commits
      Reuse inter prediction result in real-time speed 6 · 0aae1000
      Yunqing Wang authored
      In real-time speed 6, no partition search is done. The inter
      prediction results obtained during mode picking can be reused in
      the subsequent encoding process. A speed feature reuse_inter_pred_sby
      is added to enable the reuse only in speed 6.
      
      This patch doesn't change the encoding result. RTC set tests showed
      that the encoding speed gain is 2% - 5%.
      
      Change-Id: I3884780f64ef95dd8be10562926542528713b92c
      Fix some bugs in multi-arf · 8160a26f
      Paul Wilkins authored
      Fix some bugs relating to the use of buffers
      in the overlay frames.
      
      Fix bug where a mid sequence overlay was
      propagating large partition and transform sizes into
      the subsequent frame because of :-
        sf->last_partitioning_redo_frequency  > 1 and
        sf->tx_size_search_method == USE_LARGESTALL
      
      Change-Id: Ibf9ef39a5a5150f8cbdd2c9275abb0316c67873a
  20. 20 Jun, 2014 - 3 commits
  21. 12 Jun, 2014 - 3 commits
      Replacing txfm_size with tx_size. · 4345d12d
      Dmitry Kovalev authored
      Change-Id: Ifa6374e9db5919322733b656e0865f5f19ee6f2c
      Fast computation path for forward transform and quantization · ccba289f
      Jingning Han authored
      This commit enables a fast path computational flow for forward
      transformation. It checks the sse and variance of prediction
      residuals and decides if the quantized coefficients are all
      zero, dc only, or more. It then selects the corresponding coding
      path in the forward transformation and quantization stage.
      
      It is currently enabled in rtc coding mode. Will do it for rd
      coding mode next.
      
      In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps
      goes down from 14234 ms to 13704 ms, i.e., about 4% speed-up.
      Overall coding performance for rtc set is changed by -0.18%.
      
      Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1
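The three-way decision in this commit message (all zero, dc only, or more) can be sketched as a classifier on the residual's sse and variance. The commit message does not state the exact criteria, so the rule below is an assumption: the variance stands for AC energy, and sse minus variance stands for DC energy. The enum, the function name, and both thresholds are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { PATH_ALL_ZERO, PATH_DC_ONLY, PATH_FULL } CodingPath;

/* Chooses the forward transform and quantization path for one block.
 * - AC energy (var) at or above ac_thresh: run the full transform.
 * - Otherwise the AC coefficients are assumed to quantize to zero, and the
 *   DC energy (sse - var) decides between all-zero and dc-only.
 * The thresholds would depend on the quantizer; here they are inputs. */
CodingPath pick_coding_path(uint32_t sse, uint32_t var, uint32_t ac_thresh,
                            uint32_t dc_thresh) {
  assert(sse >= var);
  if (var >= ac_thresh) return PATH_FULL;
  return (sse - var < dc_thresh) ? PATH_ALL_ZERO : PATH_DC_ONLY;
}
```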
      Fix SEG_LVL_SKIP in non-RD inter mode selection. · 6c3f311b
      Alex Converse authored
      Add a set_mode_info_seg_skip function that fills the requisite mode info.
      
      Change-Id: I460b1b6845d720d9b09ed5b64df0ea0aac443f62
  22. 09 Jun, 2014 - 1 commit
      Use small transform size in non-rd real-time mode · b04d7668
      Yunqing Wang authored
      In non-rd real-time mode, choosing a smaller transform size in
      encoding gives better video quality and a good speed gain compared
      with choosing a larger transform size. This patch sets the tx size
      search method to ALLOW_8X8, which is better than using 4x4 or other
      larger sizes.
      
      Borg tests on rtc set at speed 6 showed significant gain on quality.
      PSNR gain: 11.034% and SSIM gain: 15.466%.
      
      The speed gain is 5% - 12% for <720p clips, and 2% - 7% for
      720p clips.
      
      Change-Id: If4dc74ed2df359346b059f47fb73b4a0193ec548
  23. 06 Jun, 2014 - 2 commits
  24. 05 Jun, 2014 - 2 commits