1. 08 Oct, 2012 2 commits
  2. 10 Sep, 2012 1 commit
  3. 28 Aug, 2012 3 commits
  4. 27 Aug, 2012 1 commit
  5. 24 Aug, 2012 3 commits
  6. 21 Aug, 2012 1 commit
  7. 16 Aug, 2012 1 commit
  8. 15 Aug, 2012 1 commit
  9. 12 Aug, 2012 1 commit
  10. 02 Aug, 2012 1 commit
  11. 01 Aug, 2012 1 commit
  12. 18 Jul, 2012 2 commits
  13. 25 Jun, 2012 1 commit
  14. 12 Apr, 2012 1 commit
  15. 26 Mar, 2012 1 commit
  16. 25 Mar, 2012 1 commit
  17. 23 Feb, 2012 1 commit
  18. 02 Feb, 2012 1 commit
  19. 30 Jan, 2012 3 commits
    • rv40: x86 SIMD for biweight · e5c9de2a
      Christophe Gisquet authored
      
      Provide MMX, SSE2 and SSSE3 versions, with a fast path for weights that are
      multiples of 512 (which is often the case when the values round nicely).
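
A scalar sketch of why such a fast path can work. It assumes the reference blend shifts each weighted product right by 9 before summing and rounding; that formula, the function names, and the parameter order are illustrative assumptions, not the SIMD code from this commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed reference blend: each product is shifted right by 9, then the
 * sum is rounded (+16) and shifted right by 5. */
static void biweight_ref(uint8_t *dst, const uint8_t *src1, const uint8_t *src2,
                         int w1, int w2, int size, ptrdiff_t stride)
{
    for (int j = 0; j < size; j++) {
        for (int i = 0; i < size; i++)
            dst[i] = (uint8_t)((((w2 * src1[i]) >> 9) +
                                ((w1 * src2[i]) >> 9) + 16) >> 5);
        dst  += stride;
        src1 += stride;
        src2 += stride;
    }
}

/* The fast path is only valid when both weights are multiples of 512. */
static int biweight_can_use_fast_path(int w1, int w2)
{
    return ((w1 | w2) & 0x1FF) == 0;
}

/* Fast path: shift the weights once, outside the loops. Because
 * (512 * k * s) >> 9 == k * s exactly, the result matches biweight_ref. */
static void biweight_fast(uint8_t *dst, const uint8_t *src1, const uint8_t *src2,
                          int w1, int w2, int size, ptrdiff_t stride)
{
    const int v1 = w1 >> 9;
    const int v2 = w2 >> 9;

    for (int j = 0; j < size; j++) {
        for (int i = 0; i < size; i++)
            dst[i] = (uint8_t)((v2 * src1[i] + v1 * src2[i] + 16) >> 5);
        dst  += stride;
        src1 += stride;
        src2 += stride;
    }
}
```

With pre-shifted weights the per-pixel products stay small, which is the kind of property that lets a SIMD version use narrower multiplies; whether the commit exploits it this way is not stated in the message.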
      
      *_TIMER report for the 16x16 and 8x8 cases:
      C:
      9015 decicycles in 16, 524257 runs, 31 skips
      2656 decicycles in 8, 524271 runs, 17 skips
      MMX:
      4156 decicycles in 16, 262090 runs, 54 skips
      1206 decicycles in 8, 262131 runs, 13 skips
      MMX with fast path:
      2760 decicycles in 16, 524222 runs, 66 skips
      995 decicycles in 8, 524252 runs, 36 skips
      SSE2:
      2163 decicycles in 16, 262131 runs, 13 skips
      832 decicycles in 8, 262137 runs, 7 skips
      SSE2 with fast path:
      1783 decicycles in 16, 524276 runs, 12 skips
      711 decicycles in 8, 524283 runs, 5 skips
      SSSE3:
      2117 decicycles in 16, 262136 runs, 8 skips
      814 decicycles in 8, 262143 runs, 1 skips
      SSSE3 with fast path:
      1315 decicycles in 16, 524285 runs, 3 skips
      578 decicycles in 8, 524286 runs, 2 skips
      
      This means around a 4% speedup for some sequences.
      Signed-off-by: Diego Biurrun <diego@biurrun.de>
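
The *_TIMER figures in the rv40 commit message can be turned into per-function speedup ratios. The sketch below copies the decicycle counts from that report; the struct and helper names are made up for illustration.

```c
#include <stddef.h>

/* Decicycle counts copied from the *_TIMER report (16x16 and 8x8 cases). */
struct timing {
    const char *name;
    int c16;
    int c8;
};

static const struct timing timings[] = {
    { "C",               9015, 2656 },
    { "MMX",             4156, 1206 },
    { "MMX fast path",   2760,  995 },
    { "SSE2",            2163,  832 },
    { "SSE2 fast path",  1783,  711 },
    { "SSSE3",           2117,  814 },
    { "SSSE3 fast path", 1315,  578 },
};

#define NB_TIMINGS (sizeof(timings) / sizeof(timings[0]))

/* Speedup of entry idx over the C baseline (entry 0), 16x16 case. */
static double speedup16(size_t idx)
{
    return (double)timings[0].c16 / timings[idx].c16;
}

/* Speedup of entry idx over the C baseline (entry 0), 8x8 case. */
static double speedup8(size_t idx)
{
    return (double)timings[0].c8 / timings[idx].c8;
}
```

Looping over `NB_TIMINGS` entries and printing `speedup16(i)` and `speedup8(i)` gives roughly 6.9x (16x16) and 4.6x (8x8) for the SSSE3 fast path. These ratios cover the isolated functions only, which is consistent with the much smaller whole-sequence figure quoted in the message.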
    • Diego Biurrun
      91bafb52
    • png: convert DSP functions to yasm. · 59f474b4
      Ronald S. Bultje authored
  20. 29 Jan, 2012 1 commit
  21. 12 Jan, 2012 1 commit
    • rv34: DC-only inverse transform · 3faa303a
      Christophe GISQUET authored
      
      When decoding coefficients, detect whether the block is DC-only, and use
      that knowledge to perform a DC-only inverse transform.
      
      This is achieved by:
      - first, changing the 108x4 element modulo_three_table into a 108 element
        table (a kind of base-4 packing), and accessing each value using a mask
        and shifts.
      - then, checking the low bits for 0 (as they represent the presence of
        higher frequency coefficients)
      
      Also provide x86 SIMD code for the DC-only inverse transform.
      Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
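
The table packing described in the rv34 commit message can be sketched as follows. The digit layout is an assumption (the code is split into one digit in 0..3 followed by three base-3 digits, with the DC digit stored in the top two bits); the identifiers are hypothetical and the real table contents in the decoder may differ.

```c
#include <stdint.h>

#define NB_CODES 108

/* Assumed packed layout: four 2-bit fields per entry, DC digit in
 * bits 7..6, the three higher-frequency digits in bits 5..0. */
static uint8_t packed_table[NB_CODES];

static void build_packed_table(void)
{
    for (int code = 0; code < NB_CODES; code++) {
        int d0 = code / 27;        /* DC digit, 0..3 */
        int d1 = (code / 9) % 3;   /* remaining digits, 0..2 each */
        int d2 = (code / 3) % 3;
        int d3 = code % 3;

        packed_table[code] = (uint8_t)((d0 << 6) | (d1 << 4) | (d2 << 2) | d3);
    }
}

/* Extract field idx (0 = DC digit, 3 = last digit) with a shift and a mask. */
static int packed_field(int code, int idx)
{
    return (packed_table[code] >> (6 - 2 * idx)) & 3;
}

/* All three low fields zero means no higher-frequency coefficients. */
static int is_dc_only(int code)
{
    return (packed_table[code] & 0x3F) == 0;
}
```

A decoder using this layout would call `is_dc_only()` once per code and branch to the cheaper DC-only inverse transform when it returns nonzero, instead of inspecting four separate table columns.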
  22. 09 Jan, 2012 1 commit
  23. 19 Dec, 2011 1 commit
  24. 14 Dec, 2011 1 commit
  25. 11 Oct, 2011 1 commit
  26. 11 Aug, 2011 1 commit
  27. 03 Jul, 2011 1 commit
  28. 21 Jun, 2011 1 commit
  29. 18 Jun, 2011 1 commit
  30. 05 Jun, 2011 1 commit
  31. 31 May, 2011 1 commit
  32. 21 May, 2011 1 commit