  1. 31 Oct, 2012 - 1 commit
  2. 29 Oct, 2012 - 2 commits
  3. 22 Oct, 2012 - 1 commit
  4. 16 Oct, 2012 - 1 commit
  5. 14 Oct, 2012 - 2 commits
  6. 11 Oct, 2012 - 1 commit
  7. 30 Aug, 2012 - 1 commit
    • hybrid transform of 16x16 dimension · de6dfa6b
      Jingning Han authored
      Enable the 16x16 ADST/DCT hybrid transform for I16X16 modes. This change
      mostly benefits HD sequences.
      
      Set up the framework for selectable transform dimension.
      
      Also allow a quantization parameter threshold to control the use
      of the hybrid transform. (This is currently disabled by setting the
      threshold always above the quantization parameter. Adaptive thresholding
      can be built upon this, which will further improve the coding performance.)
      
      The coding performance gains (with respect to the codec that has all
      other configuration settings turned on) are
      
      derf:   0.013
      yt:     0.086
      hd:     0.198
      std-hd: 0.501
      
      Change-Id: Ibb4263a61fc74e0b3c345f54d73e8c73552bf926
      de6dfa6b
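
The threshold gating described in this commit message can be pictured with the minimal sketch below. The names, the 0..255 quantizer index range, and the direction of the comparison are all assumptions made for illustration; the commit message only states that the feature is disabled by keeping the threshold above the quantization parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants: a 0..255 quantizer index range is assumed. */
enum { MAX_QINDEX = 255, HYBRID_TX_DISABLED = MAX_QINDEX + 1 };

/* Returns true when the hybrid (ADST/DCT) transform may be used.
 * Assumed rule: the hybrid transform is allowed once the quantizer index
 * reaches the threshold, so a threshold above every valid index switches
 * the feature off entirely. */
static bool hybrid_transform_allowed(int qindex, int threshold) {
  assert(qindex >= 0 && qindex <= MAX_QINDEX);
  return qindex >= threshold;
}
```

Under these assumptions, passing HYBRID_TX_DISABLED as the threshold reproduces the "currently disabled" behaviour, and an adaptive scheme would simply compute the threshold per frame or per block.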
  8. 20 Aug, 2012 - 1 commit
    • Superblock coding. · 5d4cffb3
      Ronald S. Bultje authored
      This commit adds a pick_sb_mode() function which selects the best 32x32
      superblock coding mode. Then it selects the best per-MB modes, compares
      the two and encodes that in the bitstream.
      
      The bitstream coding is rather simplistic right now. At the SB level,
      we code a bit to indicate whether this block uses SB-coding (32x32
      prediction) or MB-coding (anything else), and then we follow with the
      actual modes. This could and should be modified in the future, but is
      omitted from this commit because it will likely involve reorganizing
      much more code rather than just adding SB coding, so it's better to let
      that be judged on its own merits.
      
      Gains on derf: about even, YT/HD: +0.75%, STD/HD: +1.5%.
      
      Change-Id: Iae313a7cbd8f75b3c66d04a68b991cb096eaaba6
      5d4cffb3
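
The comparison this commit message describes (best 32x32 superblock mode versus the best per-MB modes) can be sketched as follows. This is not the code from pick_sb_mode(); the struct, the function name, and the use of a plain summed cost are hypothetical, and the sketch only assumes that a 32x32 superblock covers four 16x16 macroblocks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: which alternative won and what it cost. */
typedef struct {
  int use_sb;      /* 1: code the 32x32 area as one superblock (the SB bit) */
  int64_t rd_cost; /* rate-distortion cost of the chosen alternative */
} sb_decision_t;

/* Compare the cost of one 32x32 superblock against the summed cost of the
 * four 16x16 macroblocks it covers, and keep the cheaper option. */
static sb_decision_t choose_sb_or_mb(int64_t sb_cost, const int64_t mb_cost[4]) {
  int64_t mb_total = 0;
  sb_decision_t d;
  assert(mb_cost != 0);
  for (int i = 0; i < 4; ++i) mb_total += mb_cost[i];
  d.use_sb = sb_cost < mb_total;
  d.rd_cost = d.use_sb ? sb_cost : mb_total;
  return d;
}
```

The use_sb field corresponds to the single bit the message says is coded at the SB level; the actual modes would follow it in the bitstream.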
  9. 03 Aug, 2012 - 1 commit
    • 16x16 DCT blocks. · fed8a183
      Daniel Kang authored
      Set on all 16x16 intra/inter modes
      
      Features:
      - Butterfly fDCT/iDCT
      - Loop filter does not filter internal edges with 16x16
      - Optimize coefficient function
      - Update coefficient probability function
      - RD
      - Entropy stats
      - 16x16 is a config option
      
      Have not tested with experiments.
      
      hd:     2.60%
      std-hd: 2.43%
      yt:     1.32%
      derf:   0.60%
      
      Change-Id: I96fb090517c30c5da84bad4fae602c3ec0c58b1c
      fed8a183
  10. 17 Jul, 2012 - 1 commit
  11. 21 Mar, 2012 - 1 commit
    • Only support improved quant · c88d335f
      Paul Wilkins authored
      Deprecate fast quant and strict_quant code.
      This has a small effect on quality because fast quant was used in the
      first pass, but the effect is basically neutral across the derf set.
      
      The rationale here is to reduce the number of code paths for
      now to make experimentation easier. Optimized and fast code
      options can be re-introduced later along with other encode
      speed options.
      
      Change-Id: Ia30c5daf3dbc52e72c83b277a1d281e3c934cdad
      c88d335f
  12. 15 Mar, 2012 - 1 commit
    • WebM Experimental Codec Branch Snapshot · 6035da54
      Yaowu Xu authored
      This is a code snapshot of experimental work currently ongoing for a
      next-generation codec.
      
      The codebase has been cut down considerably from the libvpx baseline.
      For example, we are currently only supporting VBR 2-pass rate control
      and have removed most of the code relating to coding speed, threading,
      error resilience, partitions and various other features.  This is in
      part to make the codebase easier to work on and experiment with, but
      also because we want to have an open discussion about how the bitstream
      will be structured and partitioned and not have that conversation
      constrained by past work.
      
      Our basic working pattern has been to initially encapsulate experiments
      using configure options linked to #if CONFIG_XXX statements in the
      code. Once experiments have matured and we are reasonably happy that
      they give benefit and can be merged without breaking other experiments,
      we remove the conditional compile statements and merge them in.
      
      Current changes include:
      * T...
      6035da54
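
The working pattern described in this commit message, where an experiment lives behind a configure-time flag until it is merged, looks roughly like the sketch below. CONFIG_EXAMPLE_EXPERIMENT and min_quantizer() are hypothetical names chosen for illustration, and the returned values are placeholders.

```c
/* The build's configure step would normally define this flag as 0 or 1;
 * the fallback below only exists so that the sketch compiles on its own. */
#ifndef CONFIG_EXAMPLE_EXPERIMENT
#define CONFIG_EXAMPLE_EXPERIMENT 1
#endif

/* Hypothetical function whose behaviour depends on the experiment flag. */
static int min_quantizer(void) {
#if CONFIG_EXAMPLE_EXPERIMENT
  return 1; /* experimental path */
#else
  return 4; /* baseline path */
#endif
}
```

Once an experiment is accepted, the conditional is removed and only the experimental branch remains, which is the "remove the conditional compile statements and merge them in" step from the message.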
  13. 01 Mar, 2012 - 1 commit
  14. 16 Feb, 2012 - 1 commit
    • Code simplification · 79d330d7
      Paul Wilkins authored
      Removal of the pickinter.c and .h files and calls to this
      code.
      
      Removal of some code relating to real-time and one-pass
      settings, though there is more to be done in this regard.
      
      However, vp8_set_speed_features() now only supports
      modes 0 and 1 and speeds up to 3, so rd should always
      be set.
      
      Change-Id: I62c0c1b6154ab499785baef310536080e87bc4d8
      79d330d7
  15. 14 Feb, 2012 - 1 commit
  16. 09 Feb, 2012 - 1 commit
  17. 24 Oct, 2011 - 1 commit
    • Further segment feature extensions. · 01ce04bc
      Paul Wilkins authored
      This quite large check in includes the following:
      
      Merge in some code from Ronald (mbgraph.c) that scans a Gf/arf group.
      This is used as a basis for a simple segmentation for the normal frames
      in a gf/arf group. This code also uses satd functions from Yaowu.
      
      Adds functionality for coding the latest possible position of an EOB for
      blocks in the segment. (Currently 0-15 only, hence just for 4x4 dct).
      Where the EOB position is 0 this acts like "skip" and the normal coding
      of skip at the per mb level is disabled.
      
      Added functions (seg_common.c) for setting and reading segment feature
      elements. These may want to be optimized away at some point but while the
      mechanism is in a state of flux they provide a single location for making
      changes and keep things a bit cleaner.
      
      This is still proof of concept code. Currently the tested feature set:-
      
      Quantizer,
      Loop Filter level,
      Reference frame,
      Prediction Mode,
      EOB end stop.
      
      TBD:-
      
      Add functions for setting and reading the feature...
      01ce04bc
  18. 22 Sep, 2011 - 1 commit
  19. 22 Aug, 2011 - 2 commits
  20. 19 Aug, 2011 - 1 commit
    • Reclassify optimized ssim calculations as SSE2. · 01376858
      Fritz Koenig authored
      Calculations were incorrectly classified as either
      SSE3 or SSSE3, but they only use SSE2 instructions.
      Clean up function names and make non-RTCD code work
      as well.
      
      Change-Id: I29f5c2ead342b2086a468029c15e2c1d948b5d97
      01376858
  21. 22 Jul, 2011 - 1 commit
    • Preload reference area to an intermediate buffer in sub-pixel motion search · 20bd1446
      Yunqing Wang authored
      In sub-pixel motion search, the search range is small (+/- 3 pixels).
      Preload the whole search area from the reference buffer into a 32-byte
      aligned buffer. Then, during the search, load reference data from this
      buffer instead. This keeps data in cache and reduces the penalty for
      crossing cache lines. For the tulip clip, tests on an Intel Core2 Quad
      machine (Linux) showed the following encoder speed improvements:
        3.4%   at --rt --cpu-used=-4
        2.8%   at --rt --cpu-used=-3
        2.3%   at --rt --cpu-used=-2
        2.2%   at --rt --cpu-used=-1
      
      A test on an Atom notebook showed only a 1.1% speed improvement (speed=-4).
      A test on a Xeon machine also showed less improvement, since unaligned
      data access latency is greatly reduced in newer cores.
      
      Next, I will apply a similar idea to the other 2 sub-pixel search
      functions for encoding speed > 4.
      
      Make this change exclusively for x86 platforms.
      
      Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f
      20bd1446
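
The preloading idea in this commit message can be sketched in plain C as below. The block size, the 3-pixel margin, the 32-byte row stride, and all names are assumptions for illustration; the actual change is x86-specific and may copy a differently sized area (for example, extra pixels for interpolation filters).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed geometry: a 16x16 block with a +/- 3 pixel search margin. */
enum { BLOCK_SIZE = 16, MARGIN = 3, AREA = BLOCK_SIZE + 2 * MARGIN, AREA_STRIDE = 32 };

/* 32-byte aligned scratch buffer; every row starts on a 32-byte boundary
 * because the stride is also 32 bytes (C11 _Alignas). */
static _Alignas(32) uint8_t search_area[AREA_STRIDE * AREA];

/* Copy the search window around the block at ref_block (top-left pixel of
 * the block in the reference frame) into the aligned buffer dst.
 * The caller must guarantee that the margin lies inside the frame. */
static void preload_search_area(const uint8_t *ref_block, int ref_stride,
                                uint8_t *dst) {
  const uint8_t *src = ref_block - MARGIN * ref_stride - MARGIN;
  assert(ref_stride >= AREA);
  for (int r = 0; r < AREA; ++r) {
    memcpy(dst + r * AREA_STRIDE, src + r * ref_stride, AREA);
  }
}
```

After the copy, the block's own top-left pixel sits at row MARGIN, column MARGIN of the buffer, and all candidate positions within the margin can be read with a fixed 32-byte stride instead of the frame stride.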
  22. 20 Jul, 2011 - 1 commit
  23. 09 Jun, 2011 - 1 commit
  24. 06 Jun, 2011 - 1 commit
    • remove redundant functions · d4700731
      Yaowu Xu authored
      The encoder defined about 4 sets of similar functions to calculate sum,
      variance or sse, or a combination of them. This commit removes one set
      of these functions, get8x8var and get16x16var; calls to the latter
      function are replaced with var16x16, using the fact that on a 16x16 MB:
          variance == sse - sum*sum/256
      
      Change-Id: I803eabd1fb3ab177780a40338cbd596dffaed267
      d4700731
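
The identity quoted in this commit message can be checked with a small reference routine. The function name and signature below are hypothetical, not the encoder's var16x16; "variance" here means the sum of squared deviations of the 256 pixel differences, which is what sse - sum*sum/256 yields.

```c
#include <assert.h>
#include <stdint.h>

/* Reference computation over a 16x16 block of differences (src - ref):
 *   variance = sse - sum * sum / 256
 * where sum is the sum of differences and sse the sum of their squares. */
static unsigned int variance_16x16(const uint8_t *src, int src_stride,
                                   const uint8_t *ref, int ref_stride) {
  int sum = 0;
  unsigned int sse = 0;
  assert(src != 0 && ref != 0);
  for (int r = 0; r < 16; ++r) {
    for (int c = 0; c < 16; ++c) {
      const int diff = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += diff;
      sse += (unsigned int)(diff * diff);
    }
  }
  /* sum * sum can exceed INT_MAX (|sum| <= 255 * 256), so widen first. */
  return sse - (unsigned int)(((int64_t)sum * sum) / 256);
}
```

A caller that needs both values can therefore derive the variance from one pass that produces sum and sse, which is why a separate family of functions is redundant.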
  25. 01 Jun, 2011 - 1 commit
    • neon fast quantize block pair · 61f0c090
      Tero Rintaluoma authored
      vp8_fast_quantize_b_pair_neon function added to quantize
      two adjacent blocks at the same time to improve performance.
       - Additional 3-6% speedup compared to neon optimized fast
         quantizer (Tanya VGA@30fps, 1Mbps stream, cpu-used=-5..-16)
      
      Change-Id: I3fcbf141e5d05e9118c38ca37310458afbabaa4e
      61f0c090
  26. 09 May, 2011 - 1 commit
    • Use diamond search to replace full search in full-pixel refining search · cb7b1fb1
      Yunqing Wang authored
      In NEWMV mode, full search is currently used as the refining search
      after the n-step search. Replacing it with an iterative diamond search
      of radius 1 greatly reduces the computational complexity while
      maintaining the same encoding quality, since the refining search is
      now done for every macroblock instead of only the small percentage of
      macroblocks that were refined with full search.
      
      Tests on the test set showed a 3.4% encoding speed increase with no
      PSNR or SSIM loss.
      
      Change-Id: Ife907d7eb9544d15c34f17dc6e4cfd97cb743d41
      cb7b1fb1
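
An iterative radius-1 diamond refinement of the kind this commit message describes might look like the sketch below. The types, the callback interface, and the toy cost function are hypothetical; a real encoder would use a SAD-plus-motion-vector-cost metric and clamp candidates to the allowed search range.

```c
#include <assert.h>

typedef struct { int row, col; } mv_t;

/* Hypothetical cost callback: lower is better. */
typedef unsigned int (*mv_cost_fn)(mv_t mv, const void *ctx);

/* Repeatedly test the four radius-1 diamond neighbours of the current best
 * position and move to the cheapest one, stopping when no neighbour
 * improves the cost or after max_iters moves. */
static mv_t refine_diamond_r1(mv_t start, int max_iters, mv_cost_fn cost,
                              const void *ctx) {
  static const mv_t neighbors[4] = { {-1, 0}, {0, -1}, {0, 1}, {1, 0} };
  mv_t best = start;
  unsigned int best_cost = cost(best, ctx);
  assert(max_iters >= 0);
  for (int it = 0; it < max_iters; ++it) {
    int best_site = -1;
    for (int i = 0; i < 4; ++i) {
      const mv_t cand = { best.row + neighbors[i].row,
                          best.col + neighbors[i].col };
      const unsigned int c = cost(cand, ctx);
      if (c < best_cost) { best_cost = c; best_site = i; }
    }
    if (best_site < 0) break; /* local minimum reached */
    best.row += neighbors[best_site].row;
    best.col += neighbors[best_site].col;
  }
  return best;
}

/* Toy cost for demonstration only: squared distance to a target vector. */
static unsigned int toy_cost(mv_t mv, const void *ctx) {
  const mv_t *target = (const mv_t *)ctx;
  const int dr = mv.row - target->row, dc = mv.col - target->col;
  return (unsigned int)(dr * dr + dc * dc);
}
```

Each iteration evaluates only four candidates, compared with the full window a full search must visit, which is where the reduction in complexity comes from.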
  27. 29 Apr, 2011 - 1 commit
  28. 11 Apr, 2011 - 1 commit
  29. 18 Mar, 2011 - 1 commit
    • Increase static linkage, remove unused functions · 429dc676
      John Koleszar authored
      A large number of functions were defined with external linkage, even
      though they were only used from within one file. This patch changes
      their linkage to static and removes the vp8_ prefix from their names,
      which should make it more obvious to the reader that the function is
      contained within the current translation unit. Functions that were
      not referenced were removed.
      
      These symbols were identified by:
      
        $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' \
          | sort | grep '^ *1 '
      
      Change-Id: I59609f58ab65312012c047036ae1e0634f795779
      429dc676
  30. 11 Mar, 2011 - 1 commit
  31. 22 Feb, 2011 - 1 commit
  32. 10 Feb, 2011 - 1 commit
    • Fix relative include paths · 02321de0
      John Koleszar authored
      Allow compiling without adding vp8/{common,encoder,decoder} to the
      include paths.
      
      Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
      02321de0
  33. 19 Jan, 2011 - 1 commit
    • experiment extending the quantizer range · 5b42ae09
      Yaowu Xu authored
      Prior to this change, the VP8 min quantizer was 4, which capped the
      highest quality at around 51 dB. This experimental change extends
      the min quantizer to 1, removes the cap, and allows the highest
      quality to reach around 73 dB, consistent with the fdct/idct round-trip
      error. To test this change, use the following options at configure time:
      
      --enable-experimental --enable-extend_qrange
      
      The following is a brief log of changes in each of the patch sets
      
      patch set 1:
      In this commit, the quantization/dequantization constants are kept
      unchanged; instead, a scaling factor of 4 is rolled into fdct/idct.
      Fixed Q0 encoding tests on mobile:
        Before:    9560.567kbps Overall PSNR:50.255DB VPXSSIM:98.288
        Now:   18035.774kbps Overall PSNR:73.022DB VPXSSIM:99.991
      
      patch set 2:
      regenerated dc/ac quantizer lookup tables based on the scaling
      factor rolled in the fdct/idct. Also slightly extended the range
      towards the high quantizer end.
      
      patch set 3:
      slightly tweaked the quantizer tables and generated bits_per_mb
      table based on Paul's suggestions.
      
      patch set 4:
      fix a typo in idct, re-calculated tables relating active max Q
      to active min Q
      
      patch set 5:
      added rdmult lookup table based on Q
      
      patch set 6:
      fix rdmult scale: dct coefficients have been scaled up by 4
      
      patch set 7:
      keep transform coefficients within 16 bits
      
      patch set 8:
      normalize 2nd order quantizers
      
      patch set 9:
      fix mis-spellings
      
      patch set 10:
      change the configure script and macros to allow experimental code
      to be enabled at configure time with --enable-extend_qrange
      
      patch set 11:
      rebase for merge
      
      Change-Id: Ib50641ddd44aba2a52ed890222c309faa31cc59c
      5b42ae09
  34. 18 Jan, 2011 - 1 commit
    • Fix encoder real-time only configuration. · cb791aaa
      Attila Nagy authored
      Remove allocation/deallocation of stats storage.
      Remove full search functions in machine specific encoder inits.
      Remove last pass validation in validate_config.
      
      Change-Id: I7f29be69273981a4fef6e80ecdb6217c68cbad4e
      cb791aaa
  35. 22 Dec, 2010 - 2 commits
    • temporal filter naming changes · 4b6219cb
      Johann authored
      be more consistent with the naming pattern, especially wrt rtcd
      
      Change-Id: I3df50686a09f1dab0a9620b5adbb8a1577b40f2f
      4b6219cb
    • abstract apply_temporal_filter · 092b5bef
      Johann authored
      allow for optimized versions of apply_temporal_filter
      (now vp8_apply_temporal_filter_c)
      
      the function was previously declared as static and appears to have been
      inlined. with this change, that's no longer possible. performance takes
      a small hit.
      
      the declaration for vp8_cx_temp_filter_c was moved to onyx_if.c because
      of a circular dependency. for rtcd, temporal_filter.h holds the
      definition for the rtcd table, so it needs to be included by onyx_int.h.
      however, onyx_int.h holds the definition for VP8_COMP which is needed
      for the function prototype. blah.
      
      Change-Id: I499c055fdc652ac4659c21c5a55fe10ceb7e95e3
      092b5bef
  36. 27 Oct, 2010 - 1 commit
    • Full search SAD function optimization in SSE4.1 · 71ecb5d7
      Yunqing Wang authored
      Use mpsadbw to calculate 8 SADs at once. Function list:
      vp8_sad16x16x8_sse4
      vp8_sad16x8x8_sse4
      vp8_sad8x16x8_sse4
      vp8_sad8x8x8_sse4
      vp8_sad4x4x8_sse4
      
      (test clip: tulip)
      For best quality mode, this gave the encoder a 5% performance boost.
      For good quality mode with speed=1, this gave the encoder a 3%
      performance boost.
      
      Change-Id: I083b5a39d39144f88dcbccbef95da6498e490134
      71ecb5d7
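
As a point of reference for what "8 SADs at once" means, the scalar sketch below computes the sum of absolute differences between one 16x16 source block and eight reference blocks at consecutive horizontal offsets. The function name and signature are hypothetical, and the assumption that the eight candidates are adjacent horizontal positions is an interpretation of the commit message; the SSE4.1 versions listed in the commit are assembly built around mpsadbw, not this loop.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference: sad[k] is the SAD between the 16x16 block at src and
 * the 16x16 block starting k pixels to the right of ref, for k = 0..7.
 * The reference rows must therefore be readable for at least 23 pixels. */
static void sad16x16x8_ref(const uint8_t *src, int src_stride,
                           const uint8_t *ref, int ref_stride,
                           unsigned int sad[8]) {
  assert(sad != 0);
  for (int k = 0; k < 8; ++k) {
    unsigned int acc = 0;
    for (int r = 0; r < 16; ++r) {
      for (int c = 0; c < 16; ++c) {
        acc += (unsigned int)abs(src[r * src_stride + c] -
                                 ref[r * ref_stride + c + k]);
      }
    }
    sad[k] = acc;
  }
}
```

A full search visits many adjacent candidate positions per row, so producing eight results per call amortizes the loads of the source block across candidates.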