- 16 Jun, 2010 - 1 commit
-
-
Timothy B. Terriberry authored
Change bitreading functions to use a larger window which is refilled less often. This makes it cheap enough to do bounds checking each time the window is refilled, which avoids the need to copy the input into a large circular buffer. This uses less memory and speeds up the total decode time by 1.6% on an ARM11, 2.8% on a Cortex A8, and 2.2% on x86-32, but less than 1% on x86-64. Inlining vp8dx_bool_decoder_fill() has a big penalty on x86-32, as does moving the refill loop to the front of vp8dx_decode_bool(). However, having the refill loop between computation of the split values and the branch in vp8_decode_mb_tokens() is a big win on ARM (presumably due to memory latency and code size: refilling after normalization duplicates the code in the DECODE_AND_BRANCH_IF_ZERO and DECODE_AND_LOOP_IF_ZERO cases). Unfortunately, refilling at the end of vp8dx_bool_decoder_fill() and at the beginning of each decode step in vp8_decode_mb_tokens() means the latter requires an extra refill at the end. Platform-specific versions could avoid the problem, but would require most of detokenize.c to be duplicated. Change-Id: I16c782a63376f2a15b78f8086d899b987204c1c7
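The refill idea can be illustrated with a much simpler reader than the real one. The sketch below is a plain MSB-first bit reader, not the VP8 boolean (arithmetic) decoder, and every name in it (bit_reader, bit_reader_fill, bit_reader_read) is hypothetical. It only shows how a wide window, topped up rarely and bounds-checked on each byte, lets the reader work directly on the caller's buffer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical reader state; these names are illustrative, not libvpx's. */
typedef struct {
    const unsigned char *pos; /* next unread input byte */
    const unsigned char *end; /* one past the last input byte */
    size_t window;            /* buffered bits, most significant first */
    int bits;                 /* number of valid bits in window */
} bit_reader;

#define WINDOW_BITS ((int)(sizeof(size_t) * CHAR_BIT))

/* Top up the window one byte at a time. The bounds check runs once per
 * byte loaded, so no padded or circular copy of the input is needed. */
static void bit_reader_fill(bit_reader *br) {
    while (br->bits <= WINDOW_BITS - 8 && br->pos < br->end) {
        br->window |= (size_t)*br->pos++ << (WINDOW_BITS - 8 - br->bits);
        br->bits += 8;
    }
}

/* Read n bits (1 <= n <= 16). Bits past the end of input read as zero. */
static unsigned bit_reader_read(bit_reader *br, int n) {
    unsigned v;
    assert(n >= 1 && n <= 16);
    if (br->bits < n) bit_reader_fill(br); /* refill only when running low */
    v = (unsigned)(br->window >> (WINDOW_BITS - n));
    br->window <<= n;
    br->bits = br->bits > n ? br->bits - n : 0;
    return v;
}
```

With the two input bytes 0xA5 0x3C, reading 4, 8, and 4 bits in turn yields 0xA, 0x53, and 0xC, and the refill loop runs only once.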
-
- 15 Jun, 2010 - 1 commit
-
-
Yunqing Wang authored
Add same fix in subpixel_sse2.asm. Change-Id: Icfda6103cbf74ec43308e96961dd738aa823c14d
-
- 14 Jun, 2010 - 3 commits
-
-
John Koleszar authored
Change-Id: I7b35f4717cdd204224112f72471b551617262417
-
Guillermo Ballester Valor authored
Change-Id: I2a97f08cc3c7808ce5be39e910cc5147ecf03a1d
-
Scott LaVarnway authored
Added sse2 version of vp8_regular_quantize_b which improved encode performance (for the clip used) by ~10% for 32-bit builds and ~3% for 64-bit builds. Also updated SHADOW_ARGS_TO_STACK to allow for more than 9 arguments. Change-Id: I62f78eabc8040b39f3ffdf21be175811e96b39af
-
- 12 Jun, 2010 - 1 commit
-
-
John Koleszar authored
This patch addresses issue #79, which is a regression since commit 28de670c "Fix RD bug." If the coded error value is zero, the DOUBLE_DIVIDE_CHECK macro effectively turns the iiratio calculation into a multiplication by 1000000. This can result in a value larger than INT_MAX, giving a negative ratio. Since the error values are conceptually unsigned (though they're stored in a double), this patch makes the iiratio values unsigned, which allows the clamping to work as expected.
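A minimal sketch of the failure mode, under stated assumptions: the macro body below is an assumed shape consistent with the "multiplies by 1000000" description, and the function name and the clamp range of 1 to 20 are hypothetical. Note that this sketch clamps in floating point before converting, which is a different way of sidestepping the overflow; the patch itself makes the iiratio values unsigned.

```c
#include <assert.h>

/* Assumed shape of the macro: nudge a zero denominator away from zero,
 * so dividing by it behaves like multiplying by 1000000. */
#define DOUBLE_DIVIDE_CHECK(x) ((x) < 0 ? (x) - 0.000001 : (x) + 0.000001)

/* Hypothetical helper: compute the intra/inter error ratio and clamp it
 * while it is still a double, so the integer conversion cannot overflow. */
static unsigned int clamped_ii_ratio(double intra_error, double coded_error) {
    double ratio;
    assert(intra_error >= 0.0 && coded_error >= 0.0);
    ratio = intra_error / DOUBLE_DIVIDE_CHECK(coded_error);
    if (ratio < 1.0) return 1;   /* hypothetical lower bound */
    if (ratio > 20.0) return 20; /* hypothetical upper bound */
    return (unsigned int)ratio;
}
```

For example, an intra error of 5000 with a coded error of 0 gives a raw ratio of 5e9, which does not fit in a 32-bit int; the sketch returns the upper bound instead.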
-
- 11 Jun, 2010 - 7 commits
-
-
John Koleszar authored
Typo caused C version of 16x16x4 SAD to be called when built with --disable-runtime-cpu-detect. Change-Id: I0fe6fa67280b3a5f13acb3c8ed914f039aaaf316
-
John Koleszar authored
ssim.c compiles in a huge (512M) amount of global scratch space. Allocating this data on the heap would be a better solution, but this file doesn't need to be built at all in most cases, so as a first pass, disable it except when doing opsnr.stt output (--enable-psnr). Change-Id: I320d812f6d652a12516a16b52295ebff20b5bd42
-
Makoto Kato authored
XMM6 to XMM15 are non-volatile in the Windows x64 ABI, so we have to save these registers. Change-Id: I4676309f1350af25c8a35f0c81b1f0499ab99076
-
Paul Wilkins authored
(Thanks to Ronald S. Bultje)
-
Paul Wilkins authored
-
Paul Wilkins authored
low and high Q ends.
-
John Koleszar authored
No good reason to be tricky here. I don't know why 'break' occurred to me as the natural replacement for the 'return', but an if/else block is definitely clearer. Change-Id: I08a336307afeb0dc7efa494b37398f239f66c2cf
-
- 10 Jun, 2010 - 4 commits
-
-
Timothy B. Terriberry authored
The new scheme introduced in I68d35a2f did not clamp chroma MVs in the SPLITMV case, and clamped them incorrectly (to the luma plane bounds) in every other case. Because chroma MVs are computed from the luma MVs before clamping occurs, they could still point outside of the frame buffer and cause crashes. This clamping happens outside of the MV prediction loop, and so should not affect bitstream decoding.
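The difference between clamping against luma and chroma bounds can be sketched as follows. This is a simplified illustration only: the type and function names are hypothetical, vectors are in whole pixels, and the border size is an arbitrary parameter, none of which is taken from the decoder itself.

```c
#include <assert.h>

/* Hypothetical motion vector in whole-pixel units. */
typedef struct { int row, col; } motion_vec;

static int clamp_int(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamp v so that a block-by-block square at (x, y), displaced by v, stays
 * within `border` pixels of a plane_w by plane_h plane. */
static motion_vec clamp_mv_to_plane(motion_vec v, int x, int y, int block,
                                    int plane_w, int plane_h, int border) {
    assert(block <= plane_w + 2 * border && block <= plane_h + 2 * border);
    v.col = clamp_int(v.col, -(x + border), plane_w + border - block - x);
    v.row = clamp_int(v.row, -(y + border), plane_h + border - block - y);
    return v;
}
```

Taking a 32x32 luma plane with a 16x16 chroma plane, an 8x8 chroma block at the origin, and a 4-pixel border, a vector with col = 20 passes through unchanged when checked against the luma bounds, but is reduced to 12 when checked against the chroma bounds. That gap is the kind of out-of-buffer reference the commit describes.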
-
John Koleszar authored
Vestigial. Change-Id: Iffa9e6d5ba5199b136d7549890101da17c11e3c3
-
Yunqing Wang authored
Restructure vp8_sixtap_predict functions to eliminate extra 5-line calculation while doing first-pass only. Also, combine functions to eliminate usage of intermediate buffer. This gives the decoder a 3% performance gain on my test clips. Change-Id: I13de49638884d1a57d0855c63aea719316d08c1b
-
Paul Wilkins authored
-
- 09 Jun, 2010 - 1 commit
-
-
John Koleszar authored
This patch removes the secondary MV clamping from the MV decoder. This behavior was consistent with limits placed on non-split MVs by the reference encoder, but was inconsistent with the MVs generated in the split case. The purpose of this secondary clamping was only to prevent crashes on invalid data. It was not intended to be a behaviour an encoder could or should rely on. Instead of doing additional clamping in a way that changes the entropy context, the secondary clamp is removed and the border handling is made implementation-specific. With respect to the spec, the border is treated as essentially infinite, limited only by the clamping performed on the near/nearest reference and the maximum encodable magnitude of the residual MV. This does not affect any currently produced streams. Change-Id: I68d35a2fbb51570d6569eab4ad233961405230a3
-
- 08 Jun, 2010 - 3 commits
-
-
Yaowu Xu authored
Change-Id: I7ccc580410bea096a70dce0cc3d455348d4287c5
-
Yaowu Xu authored
Change-Id: I180a05ad57ee6164a6a169ee08e8affd09671eee
-
Paul Wilkins authored
-
- 07 Jun, 2010 - 3 commits
-
-
Paul Wilkins authored
-
Philip Jägenstedt authored
-
Yaowu Xu authored
Change-Id: I944035e720ef834561a9da0d723879a4f787312c
-
- 05 Jun, 2010 - 1 commit
-
-
John Koleszar authored
This patch adds support for building shared libraries when configured with the --enable-shared switch. Building DLLs would require more invasive changes to the sample utilities than I want to make in this patch, since on Windows you can't use the address of an imported symbol in a static initializer. The best way to work around this is probably to build the codec interface mapping table with an init() function, but DLL support is of questionable value anyway, since most Windows users will probably use a media framework lib like webmdshow, which links this library in statically. Change-Id: Iafb48900549b0c6b67f4a05d3b790b2643d026f4
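The init() workaround mentioned above might look roughly like this. It is a sketch only: codec_iface, demo_vp8_iface, codec_entry, codecs_init, and find_codec are all hypothetical names, and demo_vp8_iface is an ordinary local object standing in for a symbol that would be imported from the DLL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a codec interface exported by the library. */
typedef struct { const char *name; } codec_iface;
static codec_iface demo_vp8_iface = { "vp8" };

typedef struct {
    const char *short_name;
    const codec_iface *iface;
} codec_entry;

/* The problematic form stores the address in a static initializer:
 *   static const codec_entry codecs[] = { { "vp8", &demo_vp8_iface } };
 * Filling the table at run time avoids needing that address at link time. */
static codec_entry codecs[1];
static int codecs_ready;

static void codecs_init(void) {
    if (codecs_ready) return;
    codecs[0].short_name = "vp8";
    codecs[0].iface = &demo_vp8_iface; /* address taken at run time */
    codecs_ready = 1;
}

/* Look up an interface by short name; returns NULL if it is unknown. */
static const codec_iface *find_codec(const char *short_name) {
    size_t i;
    assert(short_name != NULL);
    codecs_init();
    for (i = 0; i < sizeof(codecs) / sizeof(codecs[0]); i++) {
        if (strcmp(codecs[i].short_name, short_name) == 0)
            return codecs[i].iface;
    }
    return NULL;
}
```

The lazy call to codecs_init() inside find_codec() is a design choice for the sketch; it is not thread-safe, and an explicit init call at program start would work equally well.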
-
- 04 Jun, 2010 - 1 commit
-
-
John Koleszar authored
Change-Id: Ieebea089095d9073b3a94932791099f614ce120c
-
- 01 Jun, 2010 - 1 commit
-
-
Yunqing Wang authored
Tests on x86 showed this function cost 2.7% of total decoding time because of all the memory reads/writes. After modification, it only costs about 0.7% of decoding time, which gives a 2% gain. Change-Id: I5003ee30b6dc6dea0bfa42a6ad7e7c22fcc7b215
-
- 30 May, 2010 - 1 commit
-
-
Yaowu Xu authored
The intra prediction needs one line above at the top edge.
-
- 28 May, 2010 - 3 commits
-
-
Luca Barbato authored
it is used by vp8/encoder/onyx_if.c fixes: vp8/encoder/onyx_if.c:5189: warning: implicit declaration of function ‘vp8_deblock’
-
Yaowu Xu authored
Change-ID: I093abe6094589a0d73f6ca85b825678a19e68285
-
Yaowu Xu authored
This is to accommodate output packets for both compressed data and psnr stats. For each frame, there is at least one packet for compressed data and one for psnr stats. For a max lag of 25, 64 is large enough to cover all lagged frames at the end of encoding. Change-Id: If20787fbc86f96e1aa16a3ccf2adc93e6c1e3d5f
-
- 25 May, 2010 - 4 commits
-
-
John Koleszar authored
This file was moved to vpx/; the stale reference currently breaks the MSVS build. Change-Id: I2c90a7a1c09cb66055e3daf84facefcaee1085a1
-
Paul Wilkins authored
is constructed from multiple source frames Change-Id: I2e026c10d02b071b401c9fe8ab8dcfc0ac306103
-
John Koleszar authored
The PLANE_{PACKED,Y,U,V,ALPHA} macros should be renamed to be within the VPX_ namespace. Fixes #27
-
John Koleszar authored
This renames the vpx_codec/ directory to vpx/, to allow applications to more consistently reference these includes with the vpx/ prefix. This allows the includes to be installed in /usr/local/include/vpx rather than polluting the system includes directory with an excessive number of includes. Change-Id: I7b0652a20543d93f38f421c60b0bbccde4d61b4f
-
- 24 May, 2010 - 1 commit
-
-
Yunqing Wang authored
-
- 21 May, 2010 - 2 commits
-
-
James Zern authored
Avoid potential name clashes and match other external types. s/IMG_FMT/VPX_$&/g s/img_fmt/vpx_$&/g Change-Id: Ia7ad5bbb6424416b37e71e5f5eb1eca31c3c707f
-
John Koleszar authored
This doesn't play well with autotools, and the preprocessor magic is confusing and unhelpful in the vp8-only context. Change-Id: I2fcb57e6eb7876ecb58509da608dc21f26077ff1
-
- 20 May, 2010 - 1 commit
-
-
Paul Wilkins authored
-
- 19 May, 2010 - 1 commit
-
-
Yaowu Xu authored
The Visual C++ compiler uses xmm registers for floating-point operations on 64-bit architectures, so its calling convention requires any function that uses xmm6-xmm15 to preserve them. However, the sse2 functions, which were originally written for 32-bit Windows, may have used xmm6 and xmm7 without preserving their contents. In this particular case, the compiler used xmm6 to hold the variable "two_pass_min_rate", and our sse2-optimized loop filter functions clobbered its value, hence the mismatch between release and debug results.
-