1. 08 Aug, 2013 - 1 commit
    • Adrian Grange's avatar
      Simplify & fix potential bug in rd_pick_partition · aae6a4c8
      Adrian Grange authored
      Different partitionings were not being evaluated against
      best_rd and there were unnecessary calls to RDCOST. This
      could have resulted in a non-optimal partioning being
      selected.
      
      I simplified the variables used to track the rate,
      distortion and RD values throughout the function.
      
      Change-Id: Ifa7085ee80d824e86791432a5bc6d8fea5a3e313
      aae6a4c8
  2. 07 Aug, 2013 - 1 commit
    • Jingning Han's avatar
      Use low precision 32x32fdct for encodemb in speed1 · debb9c68
      Jingning Han authored
      The low precision 32x32 fdct has all the intermediate steps within
      16-bit depth, hence allowing faster SSE2 implementation, at the
      expense of larger round-trip error. It was used in the rate-distortion
      optimization search loop only.
      
      Using the low precision version, in replace of the high precision one,
      affects the compression performance by about 0.7% (derf, stdhd) at
      speed 0. For speed 1, it makes derf set down by only 0.017%.
      
      Change-Id: I4e7d18fac5bea5317b91c8e7dabae143bc6b5c8b
      debb9c68
  3. 06 Aug, 2013 - 1 commit
  4. 05 Aug, 2013 - 5 commits
    • Dmitry Kovalev's avatar
      Finally removing all old block size constants. · b9c7d04e
      Dmitry Kovalev authored
      Change-Id: I3aae21e88b876d53ecc955260479980ffe04ad8d
      b9c7d04e
    • Deb Mukherjee's avatar
      Add variance based mode/skipping · 8b3faccb
      Deb Mukherjee authored
      Adds a speed feature to skip all intra modes other than
      DC_PRED if the source variance is small. This feature is
      made part of speed 1 and up.
      
      Results on derf300: psnr -0.07%, speedup about 1-2%
      
      Also uses the source variance to fine-tune the early
      termination criteria when FLAG_EARLY_TERMINATE is on.
      This feature is made part of speed 2 and up.
      
      Results on derf300: psnr -0.52%, speedup about 5-7%
      
      Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232
      8b3faccb
    • Dmitry Kovalev's avatar
      Changing the order switchable filter enum constants. · 3f611555
      Dmitry Kovalev authored
      This changeset allows to remove vp9_switchable_interp and
      vp9_switchable_interp_map arrays and make code much clear. Actually we
      still have to use these mapping but only inside read_interp_filter_type and
      write_interp_filter_type functions.
      
      Change-Id: I4026c6f8c4acefba6c81421b7bacbaa52cc45f50
      3f611555
    • Jim Bankoski's avatar
      cleanups after bw bh code · 5d2cb7ea
      Jim Bankoski authored
      Cons bw/bh parms that should have been const. Additional formatting.
      
      Change-Id: Icd36a5c9dc17dadd7284315ac0d6fef1a565ca16
      5d2cb7ea
    • Dmitry Kovalev's avatar
      Replacing long block size enum values with shorter ones (2). · d007446b
      Dmitry Kovalev authored
      Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b
      d007446b
  5. 03 Aug, 2013 - 1 commit
  6. 02 Aug, 2013 - 2 commits
  7. 01 Aug, 2013 - 7 commits
  8. 31 Jul, 2013 - 1 commit
    • Jingning Han's avatar
      Make the use of ref_frame index consistent · 86c384d3
      Jingning Han authored
      Refactor the frame buffer referencing in choose_partition and make
      it consistent with other places. This means to prevent potential
      issues when we extend reference frame buffer.
      
      Change-Id: I5ff33ed5f671e1f4cc7049622212769a9b4578d9
      86c384d3
  9. 30 Jul, 2013 - 1 commit
  10. 29 Jul, 2013 - 2 commits
  11. 26 Jul, 2013 - 1 commit
    • Paul Wilkins's avatar
      Auto min and max partition size experiment. · fe5e2a91
      Paul Wilkins authored
      Speed feature experiment to set an upper and lower
      partition size limit based on what has been seen
      in spatial neighbors.
      
      This seems to gives quite reasonable speed gains in local
      (10-15%) and when used with speed 0 the losses are small
      (0.25% derf, 0.35% stdhd). However, for now I am only
      enabling it on speed 1 as there may be clashes with the existing
      temporal partition selection in speed 2.
      
      Using a tighter min / max around the range derived from the
      neighbors increases speed further but at the cost of a
      bigger quality loss. However,  I think this spatial method could
      be combined with data from either the last frame or a variance
      method (or both) to refine the range of minimum and maximum
      partition size. I.e. consider the min and max from spatial and
      temporal neighbors and the variance recommendation.
      
      Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f
      fe5e2a91
  12. 25 Jul, 2013 - 4 commits
    • Yunqing Wang's avatar
      Add encoding option --static-thresh · d36852b7
      Yunqing Wang authored
      This option exists in VP8, and it was rewritten in VP9 to support
      skipping on different partition levels. After prediction is done,
      we can check if the residuals in the partition block will be all
      quantized to 0. If this is true, the skip flag is set, and only
      prediction data are needed in reconstruction. Based on DCT's energy
      conservation property, the skipping check can be estimated in
      spatial domain.
      
      The prediction error is calculated and compared to a threshold.
      The threshold is determined by the dequant values, and also
      adjusted by partition sizes. To be precise, the DC and AC parts
      for Y, U, and V planes are checked to decide skipping or not.
      
      Test showed that
      1. derf set:
      when static-thresh = 1, psnr loss is 0.666%;
      when static-thresh = 500, psnr loss is 1.162%;
      2. stdhd set:
      when static-thresh = 1, psnr loss is 1.249%;
      when static-thresh = 500, psnr loss is 1.668%;
      
      For different clips, encoding speedup range is between several
      percentage and 20+% when static-thresh <= 500. For example,
      clip            bitrate  static-thresh psnr    time
      akiyo(cif)       500        0          48.923  5.635s(50f)
      akiyo            500        500        48.863  4.402s(50f)
      
      parkjoy(1080p)   4000       0          30.380  77.54s(30f)
      parkjoy          4000       500        30.384  69.59s(30f)
      
      sunflower(1080p) 4000       0          44.461  85.2s(30f)
      sunflower        4000       500        44.418  78.1s(30f)
      
      Higher static-thresh values give larger speedup with larger
      quality loss.
      
      Change-Id: I857031ceb466ff314ab580ac5ec5d18542203c53
      d36852b7
    • Dmitry Kovalev's avatar
      General cleanups. · 7131cb0e
      Dmitry Kovalev authored
      Removing unused constants, macros, and function declarations. Using
      ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving
      #include from *.h to *.c. Merging for loops for motion vectors.
      
      Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
      7131cb0e
    • Adrian Grange's avatar
      Simplify handling of sub-partition motion vectors · be700e14
      Adrian Grange authored
      Simplified the code that extracts and uses the motion
      vectors for the 4 sub-partitions in rd_pick_partition.
      
      Change-Id: Iaf698ef7ee3aef9edd59015e1ae065dd359b17d9
      be700e14
    • Yaowu Xu's avatar
      fix a bug where flags are not reset · 3e386aef
      Yaowu Xu authored
      The feature that uses small partition results as a measure to skip
      mode evaluation at larger partition requires the flags to be reset.
      The reset was missing in the code path that calls rd_use_partition().
      
      Change-Id: Ia0a3a0aee1a862b6e2333d596808db7c48033d50
      3e386aef
  13. 24 Jul, 2013 - 2 commits
  14. 23 Jul, 2013 - 2 commits
    • Adrian Grange's avatar
      Rolled-up several for loops into one · 646edbc1
      Adrian Grange authored
      Several consecutive for loops executed over the same
      index range, so I rolled them into one.
      
      Change-Id: I5cfcc8c38c738478965768409cca9d09adf224e1
      646edbc1
    • Jim Bankoski's avatar
      clean up bw, bh · 86a9dec7
      Jim Bankoski authored
      many structures use bw and bh and they have different meanings.   This cl attempts
      to start this clean up and remove unneccessary 2 step look up log and then
      shift operations...
      
      also removed partition type multiple operation code in bitstream.c.
      
      Change-Id: I7e03e552bdfc0939738e430862e3073d30fdd5db
      86a9dec7
  15. 22 Jul, 2013 - 2 commits
    • Dmitry Kovalev's avatar
      Adding update_tx_counts function. · b2fc6fa9
      Dmitry Kovalev authored
      Moving common encoder/decoder code to update_tx_counts. Also renaming
      vp9_get_pred_probs_tx_size to get_tx_probs2 and adding get_tx_probs to
      call vp9_get_pred_context_tx_size inside read_selected_tx_size only once
      (twice before).
      
      Change-Id: Ia50247f3893de88ef8e9041b0d44be44a40aaa4d
      b2fc6fa9
    • Jim Bankoski's avatar
      fix left over overflow · 2ac8b50c
      Jim Bankoski authored
      This cl fixes issues rbultje brought up. that I somehow neglected when I
      submitted yaowu's patch.
      
      Change-Id: I07ad18796317822510b96e951c88d29f194a3c2e
      2ac8b50c
  16. 20 Jul, 2013 - 1 commit
    • Yaowu Xu's avatar
      added checks to prevent rate/distortion overflow · ea284d62
      Yaowu Xu authored
      At speed 2, due to the threshold scheme used, it is possible the rate
      and distortion assigned with INT_MAX value. The patch added checking
      to prevent the INT_MAX value is used in further calculation of RD
      scores. The patch also changed the assertion in rd_use_partition() to
      be mirror similar assertion in rd_pick_partition().
      
      Change-Id: Idb52c543cc1e10abdf6e6a5d6e9cb535a42214dc
      ea284d62
  17. 19 Jul, 2013 - 5 commits
  18. 18 Jul, 2013 - 1 commit
    • Ronald S. Bultje's avatar
      Merge scale_factors and scale_factors_uv. · 5ebe503f
      Ronald S. Bultje authored
      This prevents a duplicate memcpy of a 128-byte struct every time
      set_scale_factors() is called (which is a lot), thus leading to a
      decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block
      RD/partition loop.
      
      Overall, this decreases encoding time of the first 50 frames of bus
      @ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5%
      overall speedup. We can likely get more gains by removing the copy
      of the other struct (and replacing it with an indexing) as well.
      
      Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a
      5ebe503f