- 12 Mar, 2013 - 1 commit
-
-
Dmitry Kovalev authored
Removing redundant code, introducing new functions for better decomposition, adding 'clamp' function to vp9_common.h. Change-Id: Ic3b8ca13bbc38f60f0c9c43910b5802005e31aaf
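A helper like the 'clamp' function mentioned here might look roughly as follows. This is a minimal sketch, assuming an integer signature; the actual definition in vp9_common.h is not shown in this log and may differ.

```c
#include <assert.h>

/* Hypothetical stand-in for the 'clamp' helper named in the commit message.
 * Restricts value to the closed range [low, high]. */
static int clamp(int value, int low, int high) {
  assert(low <= high);
  return value < low ? low : (value > high ? high : value);
}
```

For example, clamp(42, 0, 10) yields 10 and clamp(-3, 0, 10) yields 0.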
-
- 11 Mar, 2013 - 6 commits
-
-
John Koleszar authored
The automatic merge result was incomplete. Change-Id: I8976318bfc346d867660a013a302c80edb25fc29
-
John Koleszar authored
-
John Koleszar authored
Remove the temporary branch count arrays and build the adapted probabilities while walking the tree. Gives an additional 1.5% or so on CIF. Change-Id: I875d61e5e0ec778e5d2f7f9d0837b989a91cf3a3
-
Deb Mukherjee authored
-
John Koleszar authored
-
Deb Mukherjee authored
Adds a check to exit from the increment_nmv_count function when the increment is 0. Change-Id: I99c1e342d351f7800e23590f9c2419881bf1d708
-
- 10 Mar, 2013 - 1 commit
-
-
John Koleszar authored
The previous implementation visited each node in the tree multiple times because it used each symbol's encoding to revisit the branches taken and increment its count. Instead, we can traverse the tree depth first and calculate the probabilities and branch counts as we walk back up. The complexity goes from somewhere between O(n log n) and O(n^2) (depending on how balanced the tree is) to O(n). Only one clip was tested (256 kbps, CIF); it showed a 13% decoding performance improvement. Note that this optimization should port trivially to VP8 as well. In VP8, the decoder doesn't use this function, but it does routinely show up on the profile for realtime encoding. Change-Id: I4f2848e4f41dc9a7694f73f3e75034bce08d1b12
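The depth-first approach described in this message can be sketched as below. The tree layout (an array of child pairs in which a value <= 0 is a leaf holding the negated symbol and a value > 0 is the index of the next pair), the type names, and the rounding in binary_prob are assumptions made for illustration, not the committed code.

```c
#include <assert.h>

typedef signed char tree_index;  /* assumed: <= 0 leaf (-symbol), > 0 child pair index */
typedef unsigned char prob_t;    /* assumed: 8-bit probability of taking the left branch */

/* Hypothetical helper: probability of the left branch, scaled to 1..255. */
static prob_t binary_prob(unsigned int left, unsigned int right) {
  const unsigned int total = left + right;
  unsigned int p;
  if (total == 0) return 128;
  p = (left * 256 + total / 2) / total;
  return (prob_t)(p < 1 ? 1 : (p > 255 ? 255 : p));
}

/* Visits each node once: recurse into both children, then record the
 * branch counts and probability for this node on the way back up.
 * Returns the total symbol count under node pair i. */
static unsigned int count_subtree(unsigned int i, const tree_index *tree,
                                  const unsigned int *symbol_counts,
                                  prob_t *probs, unsigned int (*branch_ct)[2]) {
  unsigned int left, right;
  assert(i % 2 == 0);  /* node pairs start at even indices */
  left = tree[i] <= 0
             ? symbol_counts[-tree[i]]
             : count_subtree((unsigned int)tree[i], tree, symbol_counts,
                             probs, branch_ct);
  right = tree[i + 1] <= 0
              ? symbol_counts[-tree[i + 1]]
              : count_subtree((unsigned int)tree[i + 1], tree, symbol_counts,
                              probs, branch_ct);
  probs[i >> 1] = binary_prob(left, right);
  branch_ct[i >> 1][0] = left;
  branch_ct[i >> 1][1] = right;
  return left + right;
}
```

Calling count_subtree(0, ...) on the root fills every entry of probs and branch_ct in a single pass, which is where the O(n) bound comes from.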
-
- 09 Mar, 2013 - 2 commits
-
-
Deb Mukherjee authored
Adds probability updates for extra bits for the nzcs, code for getting nzc stats, plus some minor cleanups and fixes. Change-Id: If2814e7f04fb52f5025ad9f400f3e6c50a00b543
-
Ronald S. Bultje authored
-
- 08 Mar, 2013 - 6 commits
-
-
Yunqing Wang authored
-
Yunqing Wang authored
Added SSE2 idct4_1d, which is called by vp9_short_iht4x4. Also modified the parameter type passed to the vp9_short_iht functions to make it work with the rtcd prototype. Change-Id: I81ba7cb4db6738f1923383b52a06deb760923ffe
-
Dmitry Kovalev authored
-
Yunqing Wang authored
-
Jingning Han authored
Increase the motion search range by 4x. Change MV_CLASS tree of the entropy coding to allow two additional mv classes to cover the extended motion vector limit. The codec determines the effective motion search range conditioned on the actual frame dimension. It provides coding gains: stdhd 0.39%, yt 0.56%, hd 0.47%. The major coding performance gains are concentrated in several sequences with intense motion activity, e.g., ped_1080p gains 7% at high bit-rates, and 3% on average. TODO: Need to further tune the rate control and motion search units. Change-Id: Ib842540a6796fbee5a797809433ef6a477c6d78d
-
Ronald S. Bultje authored
Also enable tx_select for keyframes. Change-Id: Iadb1231d9fa7af0c8dce3d9b41830b93a302479e
-
- 07 Mar, 2013 - 9 commits
-
-
Yunqing Wang authored
Optimized adding constant diff to predictor, which gave about 2% decoder performance gain. Change-Id: I47db20c31428e8c4a8f16214a85cbe386a6e9303
-
Yunqing Wang authored
-
Yunqing Wang authored
This was done based on John's suggestion. Change-Id: I62516a513c31fe3dbea0d6cd063df79d9e819ec8
-
Dmitry Kovalev authored
Change-Id: I44660975e9985310d8c654c158ee7a61291b5a08
-
Ronald S. Bultje authored
Change-Id: Ic9b336486774c95ffbb92adcb110cc0fc2a83cc5
-
Ronald S. Bultje authored
This also changes the RD search to take account of the correct block index when searching (this is required for ADST positioning to work correctly in combination with tx_select). Change-Id: Ie50d05b3a024a64ecd0b376887aa38ac5f7b6af6
-
Yunqing Wang authored
Yaowu found that this function failed to compile with MSVC because it used _mm_storel_pi((__m64 *)(dest + 0 * stride), (__m128)p0). To be safe, changed back to an integer store instruction. Also, in some builds, diff is not always 16-byte aligned; changed the code to account for that. Change-Id: I9995e5446af15dad18f3c5c0bad1ae68abef6c0d
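One way to avoid the vector cast is to store the low 64 bits with the integer intrinsic _mm_storel_epi64 and to read diff with an unaligned load. The sketch below handles a single row of 8 pixels; the function name and argument layout are hypothetical, and it assumes an x86 target with SSE2 available.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: dest[i] = saturate_u8(pred[i] + diff[i]) for i in 0..7. */
static void add_diff_row8_sse2(const uint8_t *pred, const int16_t *diff,
                               uint8_t *dest) {
  const __m128i zero = _mm_setzero_si128();
  __m128i p, d, sum;
  assert(pred && diff && dest);
  /* Widen 8 predictor bytes to 16-bit lanes. */
  p = _mm_unpacklo_epi8(_mm_loadl_epi64((const __m128i *)pred), zero);
  /* Unaligned load: diff is not assumed to be 16-byte aligned. */
  d = _mm_loadu_si128((const __m128i *)diff);
  sum = _mm_add_epi16(p, d);
  /* Saturate to 0..255 and store the low 8 bytes with an integer store,
   * so no cast between __m128i and __m128 is needed. */
  _mm_storel_epi64((__m128i *)dest, _mm_packus_epi16(sum, sum));
}
```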
-
Deb Mukherjee authored
This patch revamps the entropy coding of coefficients to code first a non-zero count per coded block and correspondingly remove the EOB token from the token set. STATUS: Main encode/decode code achieving encode/decode sync - done. Forward and backward probability updates to the nzcs - done. RD costing updates for nzcs - done. Note: The dynamic programming approach used in trellis quantization is not exactly compatible with nzcs. A suboptimal approach has been used instead, where branch costs are updated to account for changes in the nzcs. TODO: Training the default probs/counts for nzcs. Change-Id: I951bc1e22f47885077a7453a09b0493daa77883d
-
Dmitry Kovalev authored
-
- 06 Mar, 2013 - 4 commits
-
-
Paul Wilkins authored
-
Paul Wilkins authored
Added a variant of the one shot maxQ flag for two pass that forces a fixed Q for the normal inter frames. Disabled by default. Also small adjustment to the Bits per MB estimation. Change-Id: I87efdfb2d094fe1340ca9ddae37470d7b278c8b8
-
Yunqing Wang authored
-
Yunqing Wang authored
Optimized adding diff to predictor, which gave 0.8% decoder performance gain. Change-Id: Ic920f0baa8cbd13a73fa77b7f9da83b58749f0f8
-
- 05 Mar, 2013 - 5 commits
-
-
Dmitry Kovalev authored
Removing redundant 'extern' keywords, fixing formatting and #include order, code simplification. Change-Id: I0e5fdc8009010f3f885f13b5d76859b9da511758
-
Ronald S. Bultje authored
* changes:
  vpxenc: actually report mismatch on stderr.
  Make superblocks independent of macroblock code and data.
-
Dmitry Kovalev authored
-
Ronald S. Bultje authored
Because ctx->err is not set in that case, it will not report the error on stderr. Change-Id: Ifacbf5a03e676fd56522b03c0281d6c723c563ee
-
Ronald S. Bultje authored
Split macroblock and superblock tokenization and detokenization functions and coefficient-related data structs so that the bitstream layout and related code of superblock coefficients looks less like it's a hack to fit macroblocks in superblocks. In addition, unify chroma transform size selection from luma transform size (i.e. always use the same size, as long as it fits the predictor); in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma transform will now use the 16x16 (instead of the 8x8) chroma transform, and 64x64 superblocks using the 32x32 luma transform will now use the 32x32 (instead of the 16x16) chroma transform. Lastly, add a trellis optimize function for 32x32 transform blocks. HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There are a few negative points here and there that I might want to analyze a little closer. Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430
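The unified chroma rule ("same size as luma, as long as it fits the predictor") can be illustrated with the sketch below. The enum, the function name, and the assumption of 4:2:0 subsampling (chroma blocks half the luma width) are illustrative choices, not taken from the committed code.

```c
#include <assert.h>

/* Hypothetical transform size enum, ordered from smallest to largest. */
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32 } tx_size_t;

/* Returns the chroma transform size for a square block of luma width
 * block_size, assuming 4:2:0 so the chroma predictor is block_size / 2 wide. */
static tx_size_t chroma_tx_size(int block_size, tx_size_t luma_tx) {
  const int chroma_width = block_size / 2;
  tx_size_t largest_fit;
  assert(block_size >= 8);
  if (chroma_width >= 32) largest_fit = TX_32X32;
  else if (chroma_width >= 16) largest_fit = TX_16X16;
  else if (chroma_width >= 8) largest_fit = TX_8X8;
  else largest_fit = TX_4X4;
  /* Use the luma size unless it is too large for the chroma predictor. */
  return luma_tx < largest_fit ? luma_tx : largest_fit;
}
```

Under these assumptions, a 64x64 superblock with the 32x32 luma transform gets the 32x32 chroma transform, while a 32x32 superblock with the same luma transform falls back to 16x16 because its chroma predictor is only 16 pixels wide.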
-
- 04 Mar, 2013 - 6 commits
-
-
Dmitry Kovalev authored
-
Yunqing Wang authored
-
Yaowu Xu authored
-
Ronald S. Bultje authored
Change-Id: I5637d491eb6a9b7633f72e03fd9df72131eeb121
-
Yunqing Wang authored
Wrote an SSE2 vp9_short_idct4x4llm to improve the decoder performance. Change-Id: I90b9d48c4bf37aaf47995bffe7e584e6d4a2c000
-
Jingning Han authored
Fixed a couple of variable/function definitions, as well as header handling to support 16K sequence coding at high bit-rates. The width and height are each specified by two bytes in the header. Use an extra byte to explicitly indicate the scaling factors in both directions, each ranging from 0 to 15. Tested coding up to 16400x16400 dimension. Change-Id: Ibc2225c6036620270f2c0cf5172d1760aaec10ec
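The header layout described here (two bytes each for width and height, plus one byte carrying both scaling factors) could be packed as in the sketch below. The byte order, the placement of the two 4-bit scaling factors within the extra byte, and all names are assumptions; the message does not specify them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the frame size fields. */
typedef struct {
  uint16_t width;
  uint16_t height;
  uint8_t horiz_scale; /* 0..15 */
  uint8_t vert_scale;  /* 0..15 */
} frame_size_t;

/* Writes 5 bytes: width (little-endian, assumed), height, then one byte
 * with the horizontal scale in the high nibble and vertical in the low. */
static void write_frame_size(const frame_size_t *fs, uint8_t out[5]) {
  assert(fs->horiz_scale <= 15 && fs->vert_scale <= 15);
  out[0] = (uint8_t)(fs->width & 0xff);
  out[1] = (uint8_t)(fs->width >> 8);
  out[2] = (uint8_t)(fs->height & 0xff);
  out[3] = (uint8_t)(fs->height >> 8);
  out[4] = (uint8_t)((fs->horiz_scale << 4) | fs->vert_scale);
}

/* Inverse of write_frame_size. */
static frame_size_t read_frame_size(const uint8_t in[5]) {
  frame_size_t fs;
  fs.width = (uint16_t)(in[0] | (in[1] << 8));
  fs.height = (uint16_t)(in[2] | (in[3] << 8));
  fs.horiz_scale = (uint8_t)(in[4] >> 4);
  fs.vert_scale = (uint8_t)(in[4] & 0x0f);
  return fs;
}
```

Two bytes per dimension cover values up to 65535, which comfortably includes the 16400x16400 case mentioned in the message.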
-