1. 08 Aug, 2014 1 commit
    • Fix bug 807 · 69a5f5ec
      levytamar82 authored
      In the sub_pixel_*variance* functions, dst is aligned to 16 bytes, not
      to 32 bytes, so it is now loaded as unaligned data.
      
      Change-Id: I2e0b9745543697efc56fefa32857ea10117af135
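
To illustrate the kind of change described above, here is a minimal sketch of an unaligned 32-byte load. It assumes the affected code used AVX2 intrinsics, which the commit message does not state; `Sum32Bytes` is a hypothetical helper, not a libvpx function. An aligned 32-byte load (`_mm256_load_si256`) may fault when the pointer is only 16-byte aligned, whereas `_mm256_loadu_si256` accepts any address. A portable fallback is included so the sketch builds without AVX2.

```cpp
#include <cstdint>
#include <cstring>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Hypothetical helper: sums the 32 bytes starting at p.
// p may have any alignment (for example, only 16-byte aligned).
inline uint32_t Sum32Bytes(const uint8_t *p) {
#if defined(__AVX2__)
  // Unaligned load: valid for any address. _mm256_load_si256 would
  // instead require p to be 32-byte aligned.
  const __m256i v = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(p));
  // Sum of absolute differences against zero yields four 64-bit partial sums.
  const __m256i sad = _mm256_sad_epu8(v, _mm256_setzero_si256());
  alignas(32) uint64_t parts[4];
  _mm256_store_si256(reinterpret_cast<__m256i *>(parts), sad);
  return static_cast<uint32_t>(parts[0] + parts[1] + parts[2] + parts[3]);
#else
  // Portable fallback: memcpy has no alignment requirement.
  uint8_t tmp[32];
  std::memcpy(tmp, p, sizeof(tmp));
  uint32_t sum = 0;
  for (uint8_t b : tmp) sum += b;
  return sum;
#endif
}
```

Calling `Sum32Bytes(buf + 16)` on a 32-byte-aligned `buf` exercises exactly the 16-byte-aligned case the commit describes.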
  2. 01 Aug, 2014 1 commit
  3. 31 Jul, 2014 1 commit
  4. 30 Jul, 2014 1 commit
  5. 25 Jul, 2014 1 commit
  6. 10 Jul, 2014 1 commit
    • tests: add API_REGISTER_STATE_CHECK · 29e1b1a4
      James Zern authored
      Used to wrap API functions to ensure full environment consistency, as
      opposed to the renamed ASM_REGISTER_STATE_CHECK, which is used with
      assembly functions.
      Currently checks the FPU tag word in x86/x86_64 gcc builds to ensure
      emms has been called.
      
      Change-Id: Ie241772dbf903d33d516a1add4c8c6783f2e1490
  7. 10 Jun, 2014 1 commit
  8. 08 May, 2014 1 commit
    • Revert "Removing redundant variables from variance_test.cc." · 6e5e75fa
      James Zern authored
      This reverts commit 4725ab7e.
      
      The constants are necessary to avoid breakage in vs9 builds:
       warning C4180: qualifier applied to function type has no meaning; ignored
       error C2436: 'f2_' : member function or nested class in constructor initializer list
       while compiling class template member function 'std::tr1::tuple<T0,T1,T2,T3,T4,T5,T6,T7,T8,T9>::tuple(const int &,const int &,unsigned int (__cdecl &))'
       ..\test\variance_test.cc : see reference to class template instantiation 'std::tr1::tuple<T0,T1,T2,T3,T4,T5,T6,T7,T8,T9>' being compiled
      
      Change-Id: Ia218b74fc473d40f02fee84cb7009adfbe82e5a7
  9. 07 May, 2014 1 commit
  10. 27 Feb, 2014 1 commit
  11. 24 Jan, 2014 1 commit
  12. 18 Sep, 2013 1 commit
  13. 06 Sep, 2013 1 commit
    • cleanup cpplint warnings · afffa3d9
      Yaowu Xu authored
      Suggested by James Zern to clear out cpplint warnings for all unit
      test code.
      
      Change-Id: I731a3fa4d2a257eb9ef733426ba84286fbd7ea34
  14. 06 Aug, 2013 1 commit
  15. 27 Jun, 2013 1 commit
  16. 22 Jun, 2013 1 commit
  17. 21 Jun, 2013 1 commit
  18. 20 Jun, 2013 2 commits
    • SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance(). · 1e6a32f1
      Ronald S. Bultje authored
      Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to
      3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions
      which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't
      perfectly interleaved, and can probably be improved further in the
      future. I've marked this with a few TODOs/FIXMEs in the code.
      
      Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
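
The condition `(x_offset & 7 || y_offset & 7)` quoted above can be read as follows. This sketch assumes sub-pixel offsets are integers in the range 0..15 (1/16-pel steps), which the commit message does not state. Under that assumption, 0 is the full-pel position and 8 is the half-pel position, and every other value needs the general bilinear filter. The names `SubpelPath`, `ClassifyOffset`, and `NeedsBilinear` are hypothetical.

```cpp
// Hypothetical classification of a single sub-pixel offset, assuming
// offsets run from 0 to 15 in 1/16-pel steps.
enum class SubpelPath { kFullPel, kHalfPel, kBilinear };

inline SubpelPath ClassifyOffset(int offset) {
  // Any of the low three bits set means the offset is neither 0 nor 8.
  if (offset & 7) return SubpelPath::kBilinear;
  return offset == 0 ? SubpelPath::kFullPel : SubpelPath::kHalfPel;
}

// Mirrors the condition quoted in the commit message: true when either
// direction needs the general bilinear filter.
inline bool NeedsBilinear(int x_offset, int y_offset) {
  return (x_offset & 7) || (y_offset & 7);
}
```

For example, `NeedsBilinear(0, 8)` is false (full-pel horizontally, half-pel vertically), while `NeedsBilinear(3, 0)` is true.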
    • Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. · 8fb6c581
      Ronald S. Bultje authored
      Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
      3min58). Specific changes to timings for each function compared to
      original assembly-optimized versions (or just new version timings if
      no previous assembly-optimized version was available):
      
      sse2   4x4:    99 ->   82 cycles
      sse2   4x8:           128 cycles
      sse2   8x4:           121 cycles
      sse2   8x8:   149 ->  129 cycles
      sse2   8x16:  235 ->  245 cycles (?)
      sse2  16x8:   269 ->  203 cycles
      sse2  16x16:  441 ->  349 cycles
      sse2  16x32:          641 cycles
      sse2  32x16:          643 cycles
      sse2  32x32: 1733 -> 1154 cycles
      sse2  32x64:         2247 cycles
      sse2  64x32:         2323 cycles
      sse2  64x64: 6984 -> 4442 cycles
      
      ssse3  4x4:           100 cycles (?)
      ssse3  4x8:           103 cycles
      ssse3  8x4:            71 cycles
      ssse3  8x8:           147 cycles
      ssse3  8x16:          158 cycles
      ssse3 16x8:   188 ->  162 cycles
      ssse3 16x16:  316 ->  273 cycles
      ssse3 16x32:          535 cycles
      ssse3 32x16:          564 cycles
      ssse3 32x32:          973 cycles
      ssse3 32x64:         1930 cycles
      ssse3 64x32:         1922 cycles
      ssse3 64x64:         3760 cycles
      
      Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
  19. 18 Jun, 2013 1 commit
  20. 22 May, 2013 1 commit
    • Optimize variance functions · f4fcfe30
      Yunqing Wang authored
      Added SSE2 version of variance functions for super blocks.
      
      Change-Id: Ibeaae8771ca21c99d41dd74067574a51e97b412d
  21. 23 Feb, 2013 1 commit
  22. 27 Nov, 2012 1 commit
    • Add vp9_ prefix to all vp9 files · fcccbcbb
      John Koleszar authored
      Support for gyp, which doesn't support multiple objects in the same
      static library having the same basename.
      
      Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
  23. 15 Nov, 2012 1 commit
  24. 07 Nov, 2012 1 commit
    • Fix variance (signed integer) overflow · 98473443
      James Zern authored
      In the variance calculations the difference is summed and later squared.
      When the sum exceeds sqrt(2^31), the square overflows a signed 32-bit
      integer and is treated as negative when it is shifted, which gives
      incorrect results.
      
      To fix this we force the multiplication to be unsigned.
      
      The alternative fix is to shift sum down by 4 before multiplying.
      However that will reduce precision.
      
      For 16x16 blocks the maximum sum is 65280 and sqrt(2^31) is 46340 (and
      change).
      
      This change is based on:
      16982342 Missed some variance casts
      fea3556e Fix variance overflow
      
      Change-Id: I2c61856cca9db54b9b81de83b4505ea81a050a0f
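
The fix described above can be sketched as follows. This is an illustration only, assuming a 16x16 block (256 pixels, so the division is a right shift by 8) and 32-bit `int`; `Variance16x16` and `SquareOverflowsInt32` are hypothetical names, not the libvpx functions.

```cpp
#include <cstdint>

// Hypothetical helper: variance term for a 16x16 block, given the sum of
// pixel differences and the sum of squared differences (sse).
inline uint32_t Variance16x16(int sum, uint32_t sse) {
  // Cast before multiplying so the product is computed in unsigned
  // arithmetic. The maximum |sum| is 65280, and 65280 * 65280 =
  // 4261478400 fits in 32 unsigned bits but not in 31 signed bits.
  const uint32_t sum_sq =
      static_cast<uint32_t>(sum) * static_cast<uint32_t>(sum);
  return sse - (sum_sq >> 8);  // 256 pixels, so divide by 2^8
}

// Reports whether sum * sum would not fit in a signed 32-bit integer,
// computed safely in 64 bits.
inline bool SquareOverflowsInt32(int sum) {
  const int64_t wide = static_cast<int64_t>(sum) * sum;
  return wide > INT32_MAX;
}
```

With every difference equal to 255, `sum` is 65280 and `sse` is 16646400, so `Variance16x16(65280, 16646400u)` returns 0 as expected for a constant difference. A signed multiplication would overflow for that input, since `SquareOverflowsInt32` is true for any sum above 46340.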