  1. 03 Jun, 2015 2 commits
  2. 02 Jun, 2015 2 commits
  3. 01 Jun, 2015 1 commit
  4. 29 May, 2015 3 commits
  5. 26 May, 2015 1 commit
  6. 15 May, 2015 1 commit
  7. 08 May, 2015 1 commit
  8. 07 May, 2015 2 commits
  9. 06 May, 2015 1 commit
    • Move shared SAD code to vpx_dsp · d5d92898
      Johann authored
      Create a new component, vpx_dsp, for code that can be shared
      between codecs. Move the SAD code into the component.
      This reduces the size of vpxenc/dec by 36k on x86_64 builds.
      Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
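For readers unfamiliar with the metric being moved, SAD (sum of absolute differences) compares a source block against a reference block pixel by pixel. The following is a minimal plain-C sketch; the name `sad_block` and its signature are illustrative assumptions and not the vpx_dsp API, which provides per-block-size and SIMD-specialized variants.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between a source block and a reference
 * block. Strides are in pixels. Hypothetical name and signature, written
 * for illustration only. */
unsigned int sad_block(const uint8_t *src, int src_stride,
                       const uint8_t *ref, int ref_stride,
                       int width, int height) {
  unsigned int sad = 0;
  for (int r = 0; r < height; ++r) {
    for (int c = 0; c < width; ++c) {
      sad += (unsigned int)abs(src[c] - ref[c]);
    }
    src += src_stride; /* advance both pointers to the next row */
    ref += ref_stride;
  }
  return sad;
}
```

Because the computation depends only on pixel data and not on codec state, it is a natural candidate for a component shared between codecs.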
  10. 05 May, 2015 2 commits
  11. 30 Apr, 2015 2 commits
    • Add some sse2 code for intra prediction. · 493a8579
      hkuang authored
      Change-Id: I16c0a62e52dab62837c547345df31e7518620ed4
    • Remove vp9_idct16x16_10_add_ssse3() · 47767609
      Yaowu Xu authored
      The rotation computation using 2x cos(pi/16) can overflow
      32 bits. This commit disables the function to allow further
      investigation and optimization.
      Change-Id: I4a9803bc71303d459cb1ec5bbd7c4aaf8968e5cf
  12. 29 Apr, 2015 2 commits
  13. 21 Apr, 2015 2 commits
  14. 18 Apr, 2015 1 commit
  15. 17 Apr, 2015 1 commit
  16. 14 Apr, 2015 1 commit
  17. 13 Apr, 2015 1 commit
    • Force_split on 16x16 blocks in variance partition. · eb8c6675
      Marco authored
      Force split on 16x16 block (to 8x8) based on the minmax over the 8x8 sub-blocks.
      Also increase the variance threshold for 32x32, and add an exit condition in choose_partition
      (with a very safe threshold) based on the SAD used to select the reference frame.
      Some visual improvement near moving boundaries.
      Average gain in PSNR/SSIM: ~0.6%, some clips go up ~1 or 2%.
      Encoding time increases by ~1-4% (due to more 8x8 blocks), depending on clip.
      Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577
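One way to read "minmax over the 8x8 sub-blocks" is sketched below. It assumes the per-sub-block measure is the range (max minus min) of absolute source/prediction differences, and that the split is forced when those ranges differ widely across the four 8x8 sub-blocks. The commit message does not spell out the measure, so the function names, the exact metric, and the threshold handling are all assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Range (max - min) of absolute source/prediction differences inside one
 * 8x8 block. Hypothetical helper, not the libvpx function. */
int minmax_8x8(const uint8_t *src, int src_stride,
               const uint8_t *pred, int pred_stride) {
  int min = 255, max = 0;
  for (int r = 0; r < 8; ++r) {
    for (int c = 0; c < 8; ++c) {
      const int diff =
          abs(src[r * src_stride + c] - pred[r * pred_stride + c]);
      if (diff < min) min = diff;
      if (diff > max) max = diff;
    }
  }
  return max - min;
}

/* Returns 1 when the spread of the four 8x8 ranges inside a 16x16 block
 * exceeds `threshold` (an assumed tuning parameter), else 0. */
int force_split_16x16(const uint8_t *src, int src_stride,
                      const uint8_t *pred, int pred_stride, int threshold) {
  int lo = 255, hi = 0;
  for (int k = 0; k < 4; ++k) {
    const int x = (k & 1) * 8;  /* sub-block column offset */
    const int y = (k >> 1) * 8; /* sub-block row offset */
    const int mm = minmax_8x8(src + y * src_stride + x, src_stride,
                              pred + y * pred_stride + x, pred_stride);
    if (mm < lo) lo = mm;
    if (mm > hi) hi = mm;
  }
  return (hi - lo) > threshold;
}
```

The intuition is that a 16x16 block straddling a moving boundary tends to have one sub-block that predicts poorly while the others predict well, which is consistent with the visual improvement the commit reports near such boundaries.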
  18. 04 Apr, 2015 1 commit
  19. 02 Apr, 2015 1 commit
    • vp9: fix high-bitdepth NEON build · d181a627
      James Zern authored
      remove incorrect specializations in rtcd and update a configuration
      check in partial_idct_test.cc
      (cherry picked from commit 88453340)
      Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0
  20. 01 Apr, 2015 3 commits
    • Refactor block_yrd function for RTC coding mode · 1470529f
      Jingning Han authored
      This commit separates Hadamard transform/quantization operations
      from rate and distortion computation in block_yrd. This allows one
      to skip SATD computation when all transform blocks are quantized
      to zero. It also uses a new block error function that skips
      repeated computation of sum of squared residuals. It reduces the
      CPU cycles spent on block error calculation in block_yrd by 40%.
      Change-Id: I726acb2454b44af1c3bd95385abecac209959b10
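The two savings described above can be sketched roughly as follows, under stated assumptions: the rate term is approximated by a SATD-style sum over quantized coefficients, `eob == 0` signals that every coefficient quantized to zero, and the distortion function returns only the coefficient error without also accumulating the sum of squared input coefficients. All names here are hypothetical and do not mirror the actual block_yrd code.

```c
#include <stdint.h>

/* Squared error between original and dequantized coefficients. It returns
 * only the distortion term and does not also accumulate the sum of squared
 * input coefficients. Hypothetical name. */
int64_t block_error_only(const int16_t *coeff, const int16_t *dqcoeff,
                         int n) {
  int64_t error = 0;
  for (int i = 0; i < n; ++i) {
    const int diff = coeff[i] - dqcoeff[i];
    error += (int64_t)diff * diff;
  }
  return error;
}

/* Sum of absolute coefficient values, used here as a stand-in rate proxy. */
int64_t coeff_satd(const int16_t *qcoeff, int n) {
  int64_t sum = 0;
  for (int i = 0; i < n; ++i) {
    sum += qcoeff[i] < 0 ? -qcoeff[i] : qcoeff[i];
  }
  return sum;
}

/* Rate proxy for one transform block. With eob == 0 the block is all
 * zero, so the SATD pass is skipped entirely. */
int64_t block_rate_proxy(const int16_t *qcoeff, int n, int eob) {
  if (eob == 0) return 0;
  return coeff_satd(qcoeff, n);
}
```

Separating transform/quantization from the rate and distortion pass is what makes the `eob` check possible: the end-of-block position is known before any cost is computed.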
    • vp9: enable sse4 sad functions · 14e24a12
      James Zern authored
      sse4 isn't set by configure or used in rtcd; correct the sad entries to
      use sse4_1 without changing the signatures for now.
      This was done in vp8 post-vp9 branch.
      Change-Id: Ia9f1fff9f2476fdfa53ed022778dd2f708caa271
    • vp9: fix high-bitdepth NEON build · 88453340
      James Zern authored
      remove incorrect specializations in rtcd and update a configuration
      check in partial_idct_test.cc
      Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0
  21. 30 Mar, 2015 2 commits
    • Enable 16x16 Hadamard transform in SATD based mode decision · 26d3d3af
      Jingning Han authored
      This commit replaces the 16x16 2D-DCT with a Hadamard
      transform for RTC coding mode. It reduces the CPU cycle cost
      of the 16x16 transform by 5x. Overall, speed -6 encoding is
      1.5% faster with no loss in compression performance.
      Change-Id: If6c993831dc4c678d841edc804ff395ed37f2a1b
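A Hadamard transform needs only additions and subtractions, which is why it is cheaper than a DCT. The sketch below is a generic, unnormalized 2-D Hadamard followed by a sum of absolute transformed values (SATD) for square blocks up to 16x16. It is a plain-C illustration under assumed names (`hadamard_1d`, `hadamard_satd`), not the optimized libvpx routine, and its coefficient ordering and scaling may differ from the real code.

```c
#include <stdint.h>

/* In-place unnormalized 1-D Walsh-Hadamard butterfly over n elements
 * spaced `stride` apart. n must be a power of two. */
void hadamard_1d(int32_t *v, int n, int stride) {
  for (int len = 1; len < n; len <<= 1) {
    for (int i = 0; i < n; i += len << 1) {
      for (int j = i; j < i + len; ++j) {
        const int32_t a = v[j * stride];
        const int32_t b = v[(j + len) * stride];
        v[j * stride] = a + b;
        v[(j + len) * stride] = a - b;
      }
    }
  }
}

/* SATD of an n x n residual block: 2-D Hadamard, then sum of absolute
 * values. Returns -1 for unsupported sizes. */
int64_t hadamard_satd(const int16_t *residual, int stride, int n) {
  int32_t buf[16 * 16];
  int64_t satd = 0;
  if (n < 1 || n > 16 || (n & (n - 1)) != 0) return -1;
  for (int r = 0; r < n; ++r)
    for (int c = 0; c < n; ++c) buf[r * n + c] = residual[r * stride + c];
  for (int r = 0; r < n; ++r) hadamard_1d(buf + r * n, n, 1); /* rows */
  for (int c = 0; c < n; ++c) hadamard_1d(buf + c, n, n);     /* columns */
  for (int i = 0; i < n * n; ++i) satd += buf[i] < 0 ? -buf[i] : buf[i];
  return satd;
}
```

A call such as `hadamard_satd(residual, 16, 16)` would cover the 16x16 case the commit targets.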
    • Hadamard transform based coding mode decision process · 8c411f74
      Jingning Han authored
      This commit uses a Hadamard-transform-based rate-distortion cost
      estimate for RTC coding mode decisions. It improves the compression
      performance of speed -6 for many hard clips at lower bit-rates,
      for example by 5.5% for jimredvga, 6.7% for mmmoving, and 6.1% for
      niklas720p. This will introduce extra encoding cycle costs at
      this point.
      Change-Id: Iaf70634fa2417a705ee29f2456175b981db3d375
  22. 01 Mar, 2015 1 commit
    • Use variance metric for integral projection vector match · 1790d452
      Jingning Han authored
      This commit replaces SAD with variance as the metric for the
      integral projection vector match. It improves the search accuracy
      in the presence of slight lighting changes. The average speed -6
      compression performance for the rtc set is improved by 1.7%. No speed
      changes are observed for the test clips.
      Change-Id: I71c1d27e42de2aa429fb3564e6549bba1c7d6d4d
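The reason variance copes better with a lighting change can be seen in a small sketch: a uniform brightness shift adds the same constant to every element of the difference vector, which inflates SAD but cancels out of the variance. The function names below are illustrative assumptions, and the integer rounding may not match the libvpx code.

```c
#include <stdint.h>

/* SAD between two 1-D projection vectors of length n. */
int64_t vector_sad(const int16_t *ref, const int16_t *src, int n) {
  int64_t sad = 0;
  for (int i = 0; i < n; ++i) {
    const int d = ref[i] - src[i];
    sad += d < 0 ? -d : d;
  }
  return sad;
}

/* Variance-style metric: sum of squared differences minus the squared
 * sum divided by n (integer division). A constant offset between the two
 * vectors contributes nothing to this value. */
int64_t vector_var(const int16_t *ref, const int16_t *src, int n) {
  int64_t sum = 0, sse = 0;
  for (int i = 0; i < n; ++i) {
    const int d = ref[i] - src[i];
    sum += d;
    sse += (int64_t)d * d;
  }
  return sse - (sum * sum) / n;
}
```

For a reference that equals the source plus a constant, `vector_sad` grows with the offset while `vector_var` stays at zero, so the correctly aligned candidate still wins the match.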
  23. 19 Feb, 2015 1 commit
    • Integral projection based motion estimation · ed2dc59c
      Jingning Han authored
      This commit introduces a new block match motion estimation
      using integral projection measurement. The 2-D block and the nearby
      region are projected onto horizontal and vertical 1-D vectors,
      respectively. It then runs vector match, instead of block match,
      over the two separate 1-D vectors to locate the motion compensated
      reference block.
      This process is run per 64x64 block to align the reference before
      choosing partitioning in speed 6. This additional 64x64 block match
      (SSE2 version) costs around 2% of the overall CPU cycles
      at low bit-rate rtc speed 6. When strong motion activity exists in
      the video sequence, it substantially improves the partition
      selection accuracy, thereby achieving better compression performance
      and lower CPU cycles.
      The experiments were run in the RTC speed -6 setting:
      cloud 1080p 500 kbps
      17006 b/f, 37.086 dB, 5386 ms ->
      16669 b/f, 37.970 dB, 5085 ms (>0.9dB gain and 6% faster)
      pedestrian_area 1080p 500 kbps
      53537 b/f, 36.771 dB, 18706 ms ->
      51897 b/f, 36.792 dB, 18585 ms (4% bit-rate savings)
      blue_sky 1080p 500 kbps
      70214 b/f, 33.600 dB, 13979 ms ->
      53885 b/f, 33.645 dB, 10878 ms (30% bit-rate savings, 25% faster)
      jimred 400 kbps
      13380 b/f, 36.014 dB, 5723 ms ->
      13377 b/f, 36.087 dB, 5831 ms (2% bit-rate savings, 2% slower)
      Change-Id: Iffdb6ea5b16b77016bfa3dd3904d284168ae649c
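The projection-then-match idea can be sketched in a few lines. The code below collapses a block into column sums and row sums, then slides the source vector along a longer reference vector to find the best 1-D offset. It is a minimal sketch that uses an exhaustive SAD search over an assumed range; the function names are hypothetical, and the search strategy in the actual commit may differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* out[c] = sum of column c over h rows (horizontal projection vector). */
void project_columns(const uint8_t *p, int stride, int w, int h, int *out) {
  for (int c = 0; c < w; ++c) {
    int sum = 0;
    for (int r = 0; r < h; ++r) sum += p[r * stride + c];
    out[c] = sum;
  }
}

/* out[r] = sum of row r over w columns (vertical projection vector). */
void project_rows(const uint8_t *p, int stride, int w, int h, int *out) {
  for (int r = 0; r < h; ++r) {
    int sum = 0;
    for (int c = 0; c < w; ++c) sum += p[r * stride + c];
    out[r] = sum;
  }
}

/* Finds the offset d in [-range, range] that minimizes the SAD between
 * src (length n) and the window of ref starting at index range + d.
 * ref must hold n + 2 * range elements. */
int vector_match(const int *ref, const int *src, int n, int range) {
  int best_offset = 0;
  long best_cost = -1;
  for (int d = -range; d <= range; ++d) {
    long cost = 0;
    for (int i = 0; i < n; ++i)
      cost += labs((long)ref[range + d + i] - (long)src[i]);
    if (best_cost < 0 || cost < best_cost) {
      best_cost = cost;
      best_offset = d;
    }
  }
  return best_offset;
}
```

Running `vector_match` once on the column projections and once on the row projections gives the horizontal and vertical components separately. In this sketch that is two 1-D searches of `2 * range + 1` candidates each, instead of a 2-D search over `(2 * range + 1)^2` block positions.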
  24. 27 Jan, 2015 3 commits
  25. 25 Jan, 2015 1 commit
    • Add Neon intrinsic vp9_fdct8x8_quant_neon · 9f6eba41
      Frank Galligan authored
      On Nexus 7, speed -5 got a ~2% increase in perf, -6 got ~15%,
      and -7 and -8 got ~30%.
      Tested on Nexus 7, built with ndk r10d, gcc 4.9.
      Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51
  26. 19 Jan, 2015 1 commit
    • SSE2 code for the filter in MFQE. · 09673deb
      JackyChen authored
      The SSE2 code comes from VP8 MFQE and is reused in VP9. No change on
      the VP8 side. In our testing, this change gives a 2x speedup.
      Change-Id: Ib2b14144ae57c892005c1c4b84e3379d02e56716