1. 08 Jan, 2014 - 2 commits
    • AVX2 Variance Optimization · 357b6536
      levytamar82 authored
      Optimize the variance functions vp9_variance16x16, vp9_variance32x32,
      vp9_variance64x64, vp9_variance32x16, vp9_variance64x32 and
      vp9_mse16x16 by migrating them to AVX2.
      Some of the functions are optimized by processing 32 elements at a time
      instead of 16; others by processing two loop strides of 16 elements in
      a single 256-bit register.
      This optimization gives a 2.4% to 2.7% user-level performance gain
      and a 42% function-level gain.
      
      Change-Id: I265ae08a2b0196057a224a86450153ef3aebd85d
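
For orientation, the quantity these variance functions compute can be written as a scalar reference. The following is a minimal sketch, assuming the usual block-codec definition (sum of squared differences minus the squared sum of differences divided by the pixel count). The name `block_variance_ref` and its signature are hypothetical and are not taken from the commit; the AVX2 code would produce the same result while handling many pixels per instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference, not the libvpx implementation.
 * Returns the block variance and stores the sum of squared errors in *sse. */
static uint32_t block_variance_ref(const uint8_t *src, int src_stride,
                                   const uint8_t *ref, int ref_stride,
                                   int w, int h, uint32_t *sse) {
  int64_t sum = 0;  /* signed sum of pixel differences */
  uint64_t sq = 0;  /* sum of squared pixel differences */
  assert(w > 0 && h > 0);
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int d = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += d;
      sq += (uint64_t)(d * d);
    }
  }
  *sse = (uint32_t)sq;
  /* variance = SSE - sum^2 / N, using integer division */
  return (uint32_t)(sq - (uint64_t)((sum * sum) / (w * h)));
}
```

A SIMD version can be checked against a reference of this kind by comparing both the return value and `*sse` over random blocks.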
    • Replace RD modeling with a fixed point approximation. · f2ca665f
      Alex Converse authored
      Change-Id: I44eb44eb3f36c05d916ef140ef42cc84f72f99ec
  2. 07 Jan, 2014 - 8 commits
  3. 06 Jan, 2014 - 8 commits
  4. 04 Jan, 2014 - 2 commits
  5. 03 Jan, 2014 - 10 commits
    • Tune IDCT8_1D macro function interface · 3e0c62b5
      Jingning Han authored
      This commit adds input/output ports to the IDCT8_1D macro function to
      provide more flexibility in variable use. This allows several buffer
      swap operations to be skipped.
      
      Change-Id: I21f3450509537322293043b3281bfd3949868677
    • Adding RefBuffer struct. · ba41e9d4
      Dmitry Kovalev authored
      Adding RefBuffer to simplify reference buffer management. The struct has a
      pointer to image data and scale factors relative to the current frame.
      
      Change-Id: If38eb1491ff687cc11428aee339f3e052e2c5d9e
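
The commit message describes the struct only in words: a pointer to image data plus scale factors relative to the current frame. A minimal sketch of that shape follows. All type names, field names, and the 14-bit fixed-point precision are assumptions made for illustration and should not be read as the actual definition in the codebase.

```c
#include <assert.h>
#include <stdint.h>

#define SCALE_SHIFT_SKETCH 14 /* assumed fixed-point precision */

/* Hypothetical image descriptor. */
typedef struct {
  const uint8_t *data;
  int width;
  int height;
  int stride;
} ImageSketch;

/* Hypothetical scale factors of a reference relative to the current frame. */
typedef struct {
  int x_scale_fp;
  int y_scale_fp;
} ScaleFactorsSketch;

/* One reference buffer: the image plus how to map coordinates into it. */
typedef struct {
  const ImageSketch *buf;
  ScaleFactorsSketch sf;
} RefBufferSketch;

static int scale_fp(int ref_dim, int cur_dim) {
  assert(cur_dim > 0);
  return (int)(((int64_t)ref_dim << SCALE_SHIFT_SKETCH) / cur_dim);
}

/* Bind a reference image and derive its scale from the current frame size. */
static void ref_buffer_setup(RefBufferSketch *rb, const ImageSketch *ref,
                             int cur_w, int cur_h) {
  rb->buf = ref;
  rb->sf.x_scale_fp = scale_fp(ref->width, cur_w);
  rb->sf.y_scale_fp = scale_fp(ref->height, cur_h);
}

/* Map an x coordinate in the current frame into the reference image. */
static int ref_buffer_scaled_x(const RefBufferSketch *rb, int x) {
  return (int)(((int64_t)x * rb->sf.x_scale_fp) >> SCALE_SHIFT_SKETCH);
}
```

Keeping the scale factors next to the image pointer means a caller holding a reference buffer needs no separate lookup to convert coordinates.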
    • Pre planes configuration cleanup. · a8ba34d2
      Dmitry Kovalev authored
      Change-Id: I1d50f8701d9c9dedb84387a773a3e9b4daaad720
    • Reduce num of buffer swap calls in idct8_1d_sse2 · 0b1a2713
      Jingning Han authored
      This commit merges the initial buffer swap operations in idct8_1d_sse2
      into the array transpose step, reducing the number of instructions
      in that function.
      
      Change-Id: I219f6f50813390d2ec3ee37eecf2a4a2b44ae479
    • Cleaning up get_prediction_decay_rate() function. · 84520829
      Dmitry Kovalev authored
      Change-Id: Ie8fcee21f41f91f94b4fa02f2a55691dea1734e3
    • Rework idct8x8_10 SSE2 implementation · 1bb11781
      Jingning Han authored
      This commit optimizes the SSE2 implementation of idct8x8_10. It exploits
      the fact that only the top-left 4x4 block contains non-zero
      coefficients, and hence reduces the number of instructions needed.
      
      The runtime of idct8x8_10_sse2 goes down from 216 to 198 CPU cycles,
      estimated by averaging over 100000 runs. For pedestrian_area_1080p 300
      frames coded at 4000kbps, the average decoding speed goes up from
      79.3 fps to 79.7 fps.
      
      Change-Id: I6d277bbaa3ec9e1562667906975bae06904cb180
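
The saving described above comes from a general property of separable 2-D transforms: when only the top-left 4x4 coefficients are non-zero, the row pass needs to touch only four rows and each 1-D pass needs only four input terms. The sketch below illustrates that property in scalar C. The basis values are a placeholder, not the VP9 IDCT constants, and the function name is hypothetical; the actual SSE2 routine is organized differently.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder integer basis with values in [-3, 3]. These are NOT the VP9
 * IDCT constants; any fixed basis shows the same zero-skipping property. */
static int basis(int n, int k) { return ((n + 1) * (k + 2)) % 7 - 3; }

/* Separable 8x8 inverse transform that reads only the top-left nz x nz
 * coefficients. nz = 8 is the full transform; nz = 4 is the reduced one. */
static void inv_transform_8x8_sketch(const int16_t in[64], int32_t out[64],
                                     int nz) {
  int32_t tmp[64];
  assert(nz >= 1 && nz <= 8);
  /* Row pass: only the first nz rows can hold non-zero input. */
  for (int r = 0; r < nz; ++r) {
    for (int n = 0; n < 8; ++n) {
      int32_t s = 0;
      for (int k = 0; k < nz; ++k) s += in[r * 8 + k] * basis(n, k);
      tmp[r * 8 + n] = s;
    }
  }
  /* Column pass: rows nz..7 of tmp would be all zero, so they are skipped. */
  for (int c = 0; c < 8; ++c) {
    for (int n = 0; n < 8; ++n) {
      int32_t s = 0;
      for (int k = 0; k < nz; ++k) s += tmp[k * 8 + c] * basis(n, k);
      out[n * 8 + c] = s;
    }
  }
}
```

With input confined to the top-left 4x4 block, calling this with `nz = 4` gives the same output as `nz = 8` while doing roughly a third of the multiply-adds.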
    • Replacing int_mv with MV. · 672c355a
      Dmitry Kovalev authored
      Change-Id: Ifd432fa3741ba47102d298e0b348eb00f5a9ce53
    • Merging best_ref_mv and second_best_ref_mv into best_ref_mv[2]. · 5b04962c
      Dmitry Kovalev authored
      Change-Id: If04b57828847cee09a79c94e1098d1aa4990ea0d
    • Modified Handling of min and max vbr rates. · 65ede3da
      Paul Wilkins authored
      In two-pass encodes, bits are allocated to each frame
      according to a modified error score for the frame as a
      fraction of the modified error score for the clip or section.
      
      Previously a minimum rate per frame was reserved and
      subtracted from the bits allocatable by the two pass code.
      The vbr max section rate was enforced by clipping the
      actual number of bits allocated.
      
      In this patch the min and max vbr rates are enforced
      instead by clipping the modified error scores for each frame
      rather than the number of bits allocated.
      
      Small gains for all test sets (psnr and SSIM) ranging from
      ~ +0.05 for YT psnr up to ~ +0.25 for Std-hd SSIM.
      
      Change-Id: Iae27d70bdd3944e3f0cceaf225bad2e8802833de
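
The approach described in this commit, clipping the modified error scores and then allocating bits in proportion to them, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the use of `double`, and passing the clip limits as the parameters `min_err` and `max_err` are all hypothetical, and how the encoder derives those limits from the min and max vbr rates is not shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>

static double clip_double(double v, double lo, double hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical sketch: share total_bits across n frames in proportion to
 * each frame's modified error score after clipping it to [min_err, max_err].
 * Clipping the scores bounds every frame's share without a separate
 * clamp on the allocated bits. */
static void allocate_bits_sketch(const double *mod_err, size_t n,
                                 double min_err, double max_err,
                                 double total_bits, double *bits_out) {
  double clipped_total = 0.0;
  assert(n > 0 && min_err > 0.0 && min_err <= max_err);
  for (size_t i = 0; i < n; ++i)
    clipped_total += clip_double(mod_err[i], min_err, max_err);
  for (size_t i = 0; i < n; ++i)
    bits_out[i] =
        total_bits * clip_double(mod_err[i], min_err, max_err) / clipped_total;
}
```

Because the shares are computed from the clipped scores, the allocated bits still sum to `total_bits`, which is not the case when bit counts are clipped after allocation.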
    • Reusing vp9_get_skip_context() function in encoder. · f16b186b
      Dmitry Kovalev authored
      Change-Id: Ic0345622115941f49b6a568c7b8154ba892cbf0d
  6. 27 Dec, 2013 - 3 commits
    • Using VP9_FRAME_MARKER instead of raw number. · 46d5cc43
      Dmitry Kovalev authored
      Change-Id: I3addbf6d89a86a707c8df1a463da3e9e367910df
    • Removing vpx_codec_vp9x_cx and internal experimental flag. · 116e0a1a
      Dmitry Kovalev authored
      vpx_codec_vp9x_cx is not used internally. Experimental flag from
      vp9_extracfg is also not really used. YUV 4:4:4 just works after these
      changes (you have to specify --profile=1 for the encoder).
      
      Change-Id: Ib1c8461d0d19d159827e005efe868f891eea0140
    • Adaptive motion control on ref and search range · a4ce53f1
      Jingning Han authored
      This commit makes a preliminary attempt to refine the motion search
      control. For each reference frame, it measures the SAD associated with
      the mv predictor and uses it to decide whether to reduce the motion
      search range (if the predicted mv gives a fairly small SAD) or to skip
      the current reference frame (if another ref frame gives a much smaller
      SAD cost).
      
      This feature is turned on in the settings of speed 1 and above.
      
      In speed 1, compression performance changed
      derf  -0.018%
      yt    -0.043%
      hd    -0.045%
      stdhd -0.281%
      
      speed-up
      pedestrian_area_1080p at 4000 kbps 100 frames
      199651ms -> 188846ms (5.5% speed-up)
      blue_sky_1080p at 6000 kbps
      443531ms -> 415239ms (6.3% speed-up)
      
      In speed 2, compression performance changed
      derf  -0.026%
      yt    -0.090%
      hd    -0.055%
      stdhd -0.210%
      
      speed-up
      pedestrian 113949ms -> 108855ms (4.5% speed-up)
      blue_sky  271057ms -> 257322ms (5% speed-up)
      
      Change-Id: I1b74ea28278c94fea329d971d706d573983d810d
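
The decision logic described in this commit (reduce the search range when the predicted mv already gives a small SAD, skip a reference frame when another one gives a much smaller SAD) can be sketched as below. The enum, the function name, and both thresholds (`small_sad_thresh`, `skip_ratio`) are hypothetical; the commit message does not state the actual criteria or constants used by the encoder.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
  SEARCH_FULL,     /* search this reference with the normal range */
  SEARCH_REDUCED,  /* predicted mv is already good: shrink the range */
  SEARCH_SKIP_REF  /* another reference is far better: skip this one */
} SearchDecisionSketch;

/* Hypothetical sketch. pred_mv_sad[i] is the SAD obtained with the mv
 * predictor for reference frame i. */
static void decide_search_sketch(const unsigned int *pred_mv_sad, int num_refs,
                                 unsigned int small_sad_thresh,
                                 unsigned int skip_ratio,
                                 SearchDecisionSketch *out) {
  unsigned int best;
  assert(num_refs > 0 && skip_ratio > 0);
  best = pred_mv_sad[0];
  for (int i = 1; i < num_refs; ++i)
    if (pred_mv_sad[i] < best) best = pred_mv_sad[i];
  for (int i = 0; i < num_refs; ++i) {
    if ((uint64_t)pred_mv_sad[i] > (uint64_t)best * skip_ratio) {
      out[i] = SEARCH_SKIP_REF; /* more than skip_ratio times the best SAD */
    } else if (pred_mv_sad[i] < small_sad_thresh) {
      out[i] = SEARCH_REDUCED;
    } else {
      out[i] = SEARCH_FULL;
    }
  }
}
```

The skip test runs before the small-SAD test so that a reference is dropped whenever a much better one exists, even if its own SAD is below the threshold.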
  7. 20 Dec, 2013 - 7 commits