- 23 Jul, 2010 - 3 commits
-
-
Paul Wilkins authored
In two-pass encodes, the calculation of the number of bits allocated to a KF group could overflow at high data rates if the key frame interval was very long. We observed the problem in one test clip that had a section with an 8000-frame gap between key frames. Change-Id: Ic48eb86271775d7573b4afd166b567b64f25b787
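The kind of overflow described above can be sketched as follows. This is a minimal illustration, not the encoder's code: `kf_group_bits` and `clamp_to_int` are hypothetical helpers, and the sketch assumes a 32-bit `int`.

```c
#include <stdint.h>

/* Hypothetical helper: total bits available to a key-frame group.
 * The operands are widened to 64 bits BEFORE multiplying, so a long
 * interval at a high data rate cannot overflow a 32-bit int. */
static int64_t kf_group_bits(int bits_per_frame, int frame_count)
{
    return (int64_t)bits_per_frame * frame_count;
}

/* Hypothetical helper: saturate a 64-bit budget when a caller
 * still needs the value as a 32-bit int. */
static int clamp_to_int(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int)v;
}
```

For example, 400000 bits per frame over an 8000-frame gap is 3.2e9 bits, which exceeds the 32-bit signed range and would wrap if the product were formed in `int`.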
-
Timothy B. Terriberry authored
This replaces the approximate division-by-multiplication in the quantizer with an exact one that costs just one add and one shift extra. The asm versions have not been updated in this patch, and thus have been disabled, since the new method requires different multipliers which are not compatible with the old method. Change-Id: I53ac887af0f969d906e464c88b1f4be69c6b1206
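One well-known way to make division-by-multiplication exact at the cost of an extra add and shift is sketched below. It is an illustration under stated assumptions, not a copy of the patch: the names `recip`, `recip_init` and `recip_div` are hypothetical, inputs are assumed to lie in [0, 32767], divisors in [1, 32767], and right-shifting a negative `int` is assumed to be an arithmetic shift (true on common compilers, but implementation-defined in C).

```c
#include <stdint.h>

typedef struct {
    int16_t mult;   /* multiplier minus 2^16, so that it fits in 16 bits */
    int16_t shift;  /* floor(log2(d)) */
} recip;

/* Precompute the reciprocal for a divisor d in [1, 32767]. */
static recip recip_init(int d)
{
    recip r;
    uint32_t t;
    int l = 0;

    for (t = (uint32_t)d; t > 1; t >>= 1)
        l++;                                   /* l = floor(log2(d)) */
    t = 1 + ((uint32_t)1 << (16 + l)) / (uint32_t)d;
    r.mult = (int16_t)((int32_t)t - 65536);    /* lies in [-32767, 1] */
    r.shift = (int16_t)l;
    return r;
}

/* Compute x / d for x in [0, 32767] using the precomputed reciprocal.
 * Adding x restores the 2^16 term that was subtracted from the
 * multiplier; the final shift divides by 2^l.
 * Assumes >> on a negative int is an arithmetic shift. */
static int recip_div(int x, recip r)
{
    return (((x * r.mult) >> 16) + x) >> r.shift;
}
```

The result is exact in this range because the rounded-up multiplier overestimates 2^(16+l)/d by less than what is needed to push x/d across the next integer when x < 2^15.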
-
Paul Wilkins authored
Tweaked table to fit to 80 characters. Change-Id: Ie6ba80e0b31b33e23d2bf78599abe223369fcefb
-
- 22 Jul, 2010 - 1 commit
-
-
Fritz Koenig authored
These files were out of date and no longer maintained. Token decoding now implements the no-crash code, which is incompatible with this ARM assembly code. Change-Id: Ibf729886c56fca48181af60b44bda896c30023fc
-
- 19 Jul, 2010 - 3 commits
-
-
Paul Wilkins authored
Change submitted for Adrian Grange. Convert threshold calculation in ARNR filter to a lookup table. Change-Id: I12a4bbb96b9ce6231ce2a6ecc2d295610d49e7ec
-
Paul Wilkins authored
Change maximum ARNR filter width to 15. Change-Id: I3b72450ea08e96287445ec18810630ee2292954c
-
Paul Wilkins authored
Previously we had assumed that it was necessary to give a full frame's bit allocation to the alt ref frame if it has been created through temporal filtering. This is not the case. The active max quantizer control ensures that sufficient bits are allocated if needed, and allocating a full frame's worth of bits creates an excessive overhead for the ARF. Change-Id: I83c95ed7bc7ce0e53ccae6ff32db5a97f145937a
-
- 16 Jul, 2010 - 1 commit
-
-
Paul Wilkins authored
Change-Id: I37f10fbe4fbb505c1d34980a59af3e817c287e22
-
- 12 Jul, 2010 - 1 commit
-
-
Michael Kohler authored
-
- 07 Jul, 2010 - 2 commits
-
-
Yaowu Xu authored
The issue was caused by a bad merge in Change I5559d1e8. Change-Id: I6563f652bc1500202de361f8f51d11cc6ddf3331
-
Michael Kohler authored
Signed-off-by: Michael Kohler <michaelkohler@live.com>
-
- 01 Jul, 2010 - 1 commit
-
-
Adrian Grange authored
In the case where the best reference mv is not (0,0) a secondary search is carried out centered on (0,0). However, rather than sending tmp_err into the search function, motion_error was inadvertently passed. As a result tmp_err remains set at INT_MAX and the (0,0)-based search result will never be selected, even if it is better. Change-Id: I3c82b246c8c82ba887b9d3fb4c9e0a0f2fe5a76c
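The failure mode described above can be sketched with a toy search. Everything here except the names `motion_error` and `tmp_err` is hypothetical: `toy_error`, `toy_search` and `pick_mv` are stand-ins, and the error surface is made up for illustration.

```c
#include <limits.h>
#include <stdlib.h>

typedef struct { int row, col; } mv_t;

/* Toy error surface whose minimum (the "true" motion) is at (1, 0). */
static int toy_error(mv_t v)
{
    return abs(v.row - 1) + abs(v.col);
}

/* Toy search over a 3x3 window around center. The error of the best
 * candidate is written through err_out. */
static mv_t toy_search(mv_t center, int *err_out)
{
    mv_t best = center;
    int best_err = toy_error(center);
    int dr, dc;

    for (dr = -1; dr <= 1; dr++) {
        for (dc = -1; dc <= 1; dc++) {
            mv_t c = { center.row + dr, center.col + dc };
            int e = toy_error(c);
            if (e < best_err) { best_err = e; best = c; }
        }
    }
    *err_out = best_err;
    return best;
}

/* Search around ref_mv, then around (0,0) if ref_mv is not (0,0). */
static mv_t pick_mv(mv_t ref_mv, int *best_err)
{
    int motion_error = INT_MAX;
    int tmp_err = INT_MAX;
    mv_t mv = toy_search(ref_mv, &motion_error);

    if (ref_mv.row != 0 || ref_mv.col != 0) {
        mv_t zero = { 0, 0 };
        /* The second search must write into tmp_err. Passing
         * &motion_error here would leave tmp_err at INT_MAX, so the
         * comparison below could never select the (0,0)-based result. */
        mv_t tmp_mv = toy_search(zero, &tmp_err);
        if (tmp_err < motion_error) {
            motion_error = tmp_err;
            mv = tmp_mv;
        }
    }
    *best_err = motion_error;
    return mv;
}
```

With a reference vector far from the true motion, such as (8, 8), the search centered on (0,0) finds the better match and is selected.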
-
- 30 Jun, 2010 - 2 commits
-
-
John Koleszar authored
Change I9fd1a5a4 updated the multithreaded loopfilter to avoid reinitializing several parameters if they haven't changed from the last frame, but the code to update the last frame's parameters wasn't invoked in the multithreaded case. Change-Id: Ia23d937af625c01dd739608e02d110f742b7e1f2
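A minimal sketch of the caching pattern involved, assuming hypothetical types `lf_params` and `lf_state` and a counter standing in for the expensive reinitialization. The point is that recording the current frame's parameters belongs on every code path that uses the cache.

```c
typedef struct { int level; int sharpness; } lf_params;

typedef struct {
    lf_params last;    /* parameters seen on the previous frame */
    int initialized;   /* 0 until the first frame has been set up */
    int init_count;    /* stands in for the expensive reinit work */
} lf_state;

/* Called once per frame from every path, threaded or not. */
static void lf_frame_init(lf_state *s, lf_params cur)
{
    if (!s->initialized ||
        cur.level != s->last.level ||
        cur.sharpness != s->last.sharpness) {
        s->init_count++;   /* reinitialize only when something changed */
        s->initialized = 1;
    }
    /* Always record this frame's parameters. If one path skips this
     * step, the next comparison is made against stale values. */
    s->last = cur;
}
```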
-
Yunqing Wang authored
Modified loopfilter initialization to avoid unnecessary operations. Change-Id: I9fd1a5a49edc1cb8116c2a72a6908b1e437459ec
-
- 29 Jun, 2010 - 3 commits
-
-
Yunqing Wang authored
Restructured and rewrote SSE2 loopfilter functions. Combined u and v into one function to take advantage of SSE2 128-bit registers. Tests on test clips showed a 4% decoder performance improvement on Linux desktop. Change-Id: Iccc6669f09e17f2224da715f7547d6f93b0a4987
-
Paul Wilkins authored
Following conversations with Tim T (Derf) I ran a large number of tests comparing the existing polynomial expression with a simpler ^2 variant. Though the polynomial was sometimes a little better at the extremes of Q it was possible to get close for most clips and even a little better on some. This code also changes the way the RD multiplier is calculated when the ZBIN is extended to use a variant of the same ^2 expression. I hope that this simpler expression will be easier to tune further as we expand our test set and consider adjustments based on content. Change-Id: I73b2564346e74d1332c33e2c1964ae093437456c
-
Yaowu Xu authored
Besides the slight improvement in round trip error, this also fixes a sign bias in the forward transform, so the round trip errors are evenly distributed between +1s and -1s. The old bias seemed to work well with the dc sign bias in the old fdct, which no longer exists in the improved fdct. Change-Id: I8635e7be16c69e69a8669eca5438550d23089cef
-
- 28 Jun, 2010 - 1 commit
-
-
Adrian Grange authored
Corrected setting of "which_buffer" for U & V cases to match that used for Y, i.e. to refer to the temporally most recent frame of those to be filtered. Change-Id: Idf94b287ef47a05f060da3e61134a0b616adcb6b
-
- 24 Jun, 2010 - 5 commits
-
-
Scott LaVarnway authored
Change-Id: Ib479210067510162879c368428b92690591120b2
-
Yaowu Xu authored
The new fdct lowers the round trip sum squared error for a 4x4 block to ~0.12, or ~0.008/pixel. For reference, the old matrix multiply version has an average round trip error of 1.46 for a 4x4 block. Thanks to "derf" for his suggestions and references. Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79
-
Fritz Koenig authored
bestsad needs to be an int and set to INT_MAX because at the end of the function it is compared to INT_MAX to determine if there was a match in the function. Change-Id: Ie80e88e4c4bb4a1ff9446079b794d14d5a219788
-
Fritz Koenig authored
bestsad should be an int initialized to INT_MAX. The optimized SAD function expects a signed value for bestsad to use for comparison and early loop termination. When no match is made, which is determined by a comparison of bestsad to INT_MAX, INT_MAX is returned.
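The sentinel pattern described in these two commits can be sketched on a one-dimensional toy search. Only the name `bestsad` comes from the text; `sad_limited` and `best_match` are hypothetical and work on byte rows rather than image blocks.

```c
#include <limits.h>
#include <stdlib.h>

/* Sum of absolute differences over n samples, stopping early once the
 * running sum exceeds limit (the candidate cannot beat the best). */
static int sad_limited(const unsigned char *src, const unsigned char *ref,
                       int n, int limit)
{
    int sad = 0;
    int i;

    for (i = 0; i < n; i++) {
        sad += abs(src[i] - ref[i]);
        if (sad > limit)
            break;
    }
    return sad;
}

/* Slide src over ref and return the best SAD, or INT_MAX if no
 * candidate position exists. *best_pos is -1 when nothing matched. */
static int best_match(const unsigned char *src, const unsigned char *ref,
                      int ref_len, int n, int *best_pos)
{
    int bestsad = INT_MAX;   /* signed, so the comparisons stay signed */
    int pos;

    *best_pos = -1;
    for (pos = 0; pos + n <= ref_len; pos++) {
        int s = sad_limited(src, ref + pos, n, bestsad);
        if (s < bestsad) {
            bestsad = s;
            *best_pos = pos;
        }
    }
    return bestsad;          /* still INT_MAX means "no match" */
}
```

If `bestsad` were unsigned, both the early-termination comparison and the final check against INT_MAX would mix signed and unsigned operands, which is the kind of mismatch the commits guard against.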
-
John Koleszar authored
These are mostly vestigial: it's up to the compiler to decide what should be inlined, and this collided with certain Windows platform SDKs. Change-Id: I80dd35de25eda7773156e355b5aef8f7e44e179b
-
- 21 Jun, 2010 - 4 commits
-
-
agrange authored
1. Unavailability of each reference frame type should be tested independently. 2. Only the VP8_GOLD_FLAG needs to be tested before setting golden frame specific thresholds, and only VP8_ALT_FLAG needs testing before setting thresholds relevant to the AltRef frame. (Raised by gbvalor, in response to Issue 47) Change-Id: I6a06fc2a6592841d85422bc1661e33349bb6c3b8
-
agrange authored
The intent is to reset the appropriate bit in ref_frame_flags, not to test a logic condition. The prior result would always have been ref_frame_flags being set to 0. (Issue reported by dgohman, issue 47) Change-Id: I2c12502ed74c73cf38e98c9680e0249c29e16433
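The difference between clearing a bit and testing a logic condition can be shown in a few lines. The flag names and values below are hypothetical placeholders, not the library's definitions.

```c
/* Hypothetical flag values, for illustration only. */
enum { EX_LAST_FLAG = 1, EX_GOLD_FLAG = 2, EX_ALT_FLAG = 4 };

/* Logical NOT of any nonzero flag is 0, so this always returns 0. */
static int clear_flag_wrong(int flags, int flag)
{
    return flags & !flag;
}

/* Bitwise NOT produces a mask with only the chosen bit cleared. */
static int clear_flag(int flags, int flag)
{
    return flags & ~flag;
}
```

Starting from all three flags set (7), `clear_flag(7, EX_GOLD_FLAG)` leaves 5, while the logical-NOT version yields 0.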
-
agrange authored
The DOUBLE_DIVIDE_CHECK macro prevents division by 0, so it must be applied to the denominator to work as intended. Change-Id: Ie109242d52dbb9a2c4bc1e11890fa51b5f87ffc7
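A minimal sketch of a guard of this kind, assuming a hypothetical macro `EX_DIVIDE_CHECK` that nudges its argument away from zero by a small epsilon. The epsilon value is an assumption chosen for illustration.

```c
/* Hypothetical guard: move x away from zero by a small epsilon,
 * keeping its sign. The epsilon value is an assumption. */
#define EX_DIVIDE_CHECK(x) ((x) < 0 ? (x) - 0.000001 : (x) + 0.000001)

/* The guard wraps the denominator. Wrapping the numerator instead
 * would leave a zero denominator untouched. */
static double safe_ratio(double num, double den)
{
    return num / EX_DIVIDE_CHECK(den);
}
```

With a zero denominator the result is large but finite, at the cost of a tiny bias on ordinary inputs.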
-
Timothy B. Terriberry authored
If the version script produced by the libvpx build system is not used when linking a shared library on x86-64 Linux, the constant data in the subpel filters produces R_X86_64_32 relocation errors due to the use of wrt rip addressing instead of wrt rip wrt ..gotpcrel. Instead of adding a new macro for this addressing mode, this patch sets the ELF visibility of these symbols to "hidden", which allows wrt rip addressing to work without a text relocation. This allows building a shared library without using the provided build system or a separate version script. Fixes http://code.google.com/p/webm/issues/detail?id=46 Change-Id: Ie108f9d9a4352e5af46938bf4750d2302c1b2dc2
-
- 18 Jun, 2010 - 2 commits
-
-
Jim Bankoski authored
Remove a couple instructions from this function which weren't necessary for correct execution. Change-Id: Ib649674f140689f7e5c1530c35686241688a3151
-
John Koleszar authored
When the license headers were updated, they accidentally contained trailing whitespace, so unfortunately we have to touch all the files again. Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
-
- 16 Jun, 2010 - 1 commit
-
-
Timothy B. Terriberry authored
Change bitreading functions to use a larger window which is refilled less often. This makes it cheap enough to do bounds checking each time the window is refilled, which avoids the need to copy the input into a large circular buffer. This uses less memory and speeds up the total decode time by 1.6% on an ARM11, 2.8% on a Cortex A8, and 2.2% on x86-32, but less than 1% on x86-64. Inlining vp8dx_bool_decoder_fill() has a big penalty on x86-32, as does moving the refill loop to the front of vp8dx_decode_bool(). However, having the refill loop between computation of the split values and the branch in vp8_decode_mb_tokens() is a big win on ARM (presumably due to memory latency and code size: refilling after normalization duplicates the code in the DECODE_AND_BRANCH_IF_ZERO and DECODE_AND_LOOP_IF_ZERO cases). Unfortunately, refilling at the end of vp8dx_bool_decoder_fill() and at the beginning of each decode step in vp8_decode_mb_tokens() means the latter requires an extra refill at the end. Platform-specific versions could avoid the problem, but would require most of detokenize.c to be duplicated. Change-Id: I16c782a63376f2a15b78f8086d899b987204c1c7
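The windowed-refill idea can be sketched with a plain MSB-first bit reader. This is not the boolean arithmetic decoder the commit modifies: `bit_reader`, `br_init`, `br_fill` and `br_read` are hypothetical names, and the sketch only shows how a wide window lets the bounds check happen at refill time instead of on every bit.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *buf;   /* next unread input byte */
    const uint8_t *end;   /* one past the last input byte */
    uint64_t window;      /* unread bits, aligned to the top of the word */
    int count;            /* number of valid bits in window */
} bit_reader;

static void br_init(bit_reader *br, const uint8_t *data, size_t len)
{
    br->buf = data;
    br->end = data + len;
    br->window = 0;
    br->count = 0;
}

/* Top up the window a byte at a time. The only bounds check is the
 * comparison against end, made here rather than once per bit. */
static void br_fill(bit_reader *br)
{
    while (br->count <= 56 && br->buf < br->end) {
        br->window |= (uint64_t)*br->buf++ << (56 - br->count);
        br->count += 8;
    }
}

/* Read n bits, 1 <= n <= 32. Past the end of input, zeros are returned. */
static uint32_t br_read(bit_reader *br, int n)
{
    uint32_t v;

    if (br->count < n)
        br_fill(br);
    v = (uint32_t)(br->window >> (64 - n));
    br->window <<= n;
    br->count -= n;
    if (br->count < 0)
        br->count = 0;    /* input exhausted; window is already zero */
    return v;
}
```

Because the reader consults `end` directly, the caller's buffer can be used in place without copying it into a padded or circular buffer first.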
-
- 15 Jun, 2010 - 1 commit
-
-
Yunqing Wang authored
Add same fix in subpixel_sse2.asm. Change-Id: Icfda6103cbf74ec43308e96961dd738aa823c14d
-
- 14 Jun, 2010 - 3 commits
-
-
John Koleszar authored
Change-Id: I7b35f4717cdd204224112f72471b551617262417
-
Guillermo Ballester Valor authored
Change-Id: I2a97f08cc3c7808ce5be39e910cc5147ecf03a1d
-
Scott LaVarnway authored
Added an SSE2 version of vp8_regular_quantize_b, which improved encode performance (for the clip used) by ~10% for 32-bit builds and ~3% for 64-bit builds. Also updated SHADOW_ARGS_TO_STACK to allow for more than 9 arguments. Change-Id: I62f78eabc8040b39f3ffdf21be175811e96b39af
-
- 12 Jun, 2010 - 1 commit
-
-
John Koleszar authored
This patch addresses issue #79, which is a regression since commit 28de670c "Fix RD bug." If the coded error value is zero, the iiratio calculation effectively multiplies by 1000000 via the DOUBLE_DIVIDE_CHECK macro. This can result in a value larger than INT_MAX, giving a negative ratio. Since the error values are conceptually unsigned (though they're stored in a double), this patch makes the iiratio values unsigned, which allows the clamping to work as expected.
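The hazard here is converting an out-of-range double to an integer type. The sketch below uses a different remedy from the one the patch describes: instead of changing the type to unsigned, it clamps while the value is still a double. `ratio_to_int` is a hypothetical helper.

```c
/* Hypothetical helper: clamp a ratio to [lo, hi] while it is still a
 * double, then convert. Converting an out-of-range double straight to
 * int is undefined behavior in C, so the clamp has to come first. */
static int ratio_to_int(double ratio, int lo, int hi)
{
    if (!(ratio >= lo))   /* written this way so NaN also maps to lo */
        return lo;
    if (ratio > hi)
        return hi;
    return (int)ratio;
}
```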
-
- 11 Jun, 2010 - 5 commits
-
-
John Koleszar authored
Typo caused C version of 16x16x4 SAD to be called when built with --disable-runtime-cpu-detect. Change-Id: I0fe6fa67280b3a5f13acb3c8ed914f039aaaf316
-
John Koleszar authored
ssim.c compiles in a huge (512M) amount of global scratch space. Allocating this data on the heap would be a better solution, but this file doesn't need to be built at all in most cases, so as a first pass, disable it except when doing opsnr.stt output (--enable-psnr). Change-Id: I320d812f6d652a12516a16b52295ebff20b5bd42
-
Makoto Kato authored
XMM6 to XMM15 are non-volatile in the Windows x64 ABI, so we have to save these registers. Change-Id: I4676309f1350af25c8a35f0c81b1f0499ab99076
-
Paul Wilkins authored
(Thanks to Ronald S. Bultje)
-
Paul Wilkins authored
-