- 21 Jun, 2013 - 11 commits
-
-
Ronald S. Bultje authored
Change-Id: I8fcab81e390f93dc17e9666bbf8f77883b5aa897
-
Ronald S. Bultje authored
Fixes a crash on Windows when building with MSVC. Change-Id: I124ac756a1be55d190fadda5fcc46d23b1445dbf
-
Ronald S. Bultje authored
Change vp9_block_error() to return a 64-bit error variable, and change all callers to expect a 64-bit return value (this will prevent overflows, which we basically don't check for at all right now). Remove the duplicate block_error() function, which fixed that through truncation. Remove the old (incompatible) mmx/sse2 block_error SIMD versions and replace them with a new one that returns a 64-bit value. Encoding time of the first 50 frames of bus @ 1500kbps goes from 3min29 to 3min23, i.e. a 3% overall speedup. Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
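The overflow concern in the message above can be illustrated with a minimal sketch. It assumes the block error is a sum of squared differences between two 16-bit coefficient arrays; the function name `block_error_sketch` and its signature are hypothetical and are not the actual vp9_block_error() prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: sum of squared differences between two
 * coefficient arrays, accumulated in 64 bits. */
int64_t block_error_sketch(const int16_t *coeff, const int16_t *dqcoeff,
                           size_t n) {
  int64_t error = 0;
  size_t i;
  assert(coeff != NULL && dqcoeff != NULL);
  for (i = 0; i < n; ++i) {
    /* Widen before subtracting so neither the difference nor its square
     * can overflow a 32-bit intermediate. */
    const int64_t diff = (int64_t)coeff[i] - (int64_t)dqcoeff[i];
    error += diff * diff;
  }
  return error;
}
```

With 1024 coefficients at opposite ends of the 16-bit range, the total is far larger than INT32_MAX, which is the kind of case a 32-bit return value would have to truncate.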
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
3% faster overall (3min35.0 to 3min28.5). Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
-
Yaowu Xu authored
-
Yaowu Xu authored
and remove unused code. Change-Id: If380440c4450294b5450b7a9eeb94a376846ec01
-
Yaowu Xu authored
-
Yaowu Xu authored
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
-
Yaowu Xu authored
-
- 20 Jun, 2013 - 27 commits
-
-
Ronald S. Bultje authored
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to 3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't perfectly interleaved, and can probably be improved further in the future. I've marked this with a few TODOs/FIXMEs in the code. Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
-
Jim Bankoski authored
-
Jim Bankoski authored
Change-Id: Idfd69e66e8982275eb00d8007a55efd1a4f86a98
-
James Zern authored
-
Frank Galligan authored
- size_t vs int. Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134
-
James Zern authored
This reverts commit 90a9900a. Seems to break the Mac build: src/include/gtest/internal/gtest-port.h:1208:: pthread_mutex_lock(&mutex_)failed with error 22 Abort trap: 6. Change-Id: Icbe31161d7c27f1b0a28d33409e7712430bbf0ae
-
Jingning Han authored
-
Johann authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Deb Mukherjee authored
Improves the rd modeling function and implements it using interpolation from a table, which is a little faster. Also uses sse rather than var as input to the modeling function: since no dc prediction is used, sse works a little better. derfraw300: +0.05%. Speedup: ~1%. Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff
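As a rough sketch of what "interpolation from a table" can look like, the snippet below does integer linear interpolation between evenly spaced table entries. Everything here is an assumption for illustration: the names `kTable` and `interp_from_table`, the spacing, and the table values are placeholders, not the encoder's actual rd model.

```c
#include <assert.h>

#define TABLE_SIZE 5
#define STEP_BITS 4 /* table entries are spaced 16 input units apart */

/* Placeholder values for illustration only, not the real model. */
static const int kTable[TABLE_SIZE] = { 1024, 512, 256, 128, 64 };

/* Hypothetical helper: look up x in the table, interpolating linearly
 * between the two nearest entries; inputs past the end are clamped. */
int interp_from_table(int x) {
  const int step = 1 << STEP_BITS;
  const int max_x = (TABLE_SIZE - 1) * step;
  int idx, frac;
  assert(x >= 0);
  if (x >= max_x) return kTable[TABLE_SIZE - 1];
  idx = x >> STEP_BITS;  /* index of the entry at or below x */
  frac = x & (step - 1); /* distance past that entry, in input units */
  return (kTable[idx] * (step - frac) + kTable[idx + 1] * frac) >> STEP_BITS;
}
```

The appeal of this approach is that the expensive function is evaluated only when the table is built, while each query costs a shift, a mask, and two multiplies.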
-
Johann authored
dboolhuff.c(50) : warning C4267: 'initializing' : conversion from 'size_t' to 'int' Change-Id: I6b85759efb2fa19f362f406623d8a7583a55c036
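One common way to address a warning like C4267 is to make the narrowing explicit and guard it with a range check. The helper below is a hypothetical sketch of that pattern (`size_to_int` is not a libvpx function), and it is not necessarily what this commit changed in dboolhuff.c.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: convert size_t to int, asserting the value fits. */
int size_to_int(size_t n) {
  assert(n <= (size_t)INT_MAX);
  return (int)n; /* explicit cast documents the intended narrowing */
}
```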
-
Jim Bankoski authored
Adds a new speed feature to force partitioning to be greater than or less than a certain size. Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0
-
Jim Bankoski authored
This feature lets you set a partitioning size to be used by the entire frame. Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06
-
Jim Bankoski authored
This uses variance to decide partition splits. Variance is calculated using the nearest mv, always from the last ref frame. Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
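A minimal sketch of a variance-based split decision follows. It assumes the prediction block has already been fetched (for example with the nearest mv from the last reference frame, as the message describes) and that a block is split when the variance of the source-minus-prediction difference exceeds a threshold. The names `diff_variance` and `should_split` and the threshold parameter are hypothetical, not the encoder's actual functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: variance of (src - pred) over a w x h block, left
 * scaled by the pixel count, i.e. sse - sum^2 / n. */
int64_t diff_variance(const uint8_t *src, const uint8_t *pred, int stride,
                      int w, int h) {
  int64_t sum = 0, sse = 0;
  int r, c;
  assert(w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int d = src[r * stride + c] - pred[r * stride + c];
      sum += d;
      sse += d * d;
    }
  }
  return sse - (sum * sum) / ((int64_t)w * h);
}

/* Hypothetical decision rule: split when the difference is "busy". */
int should_split(const uint8_t *src, const uint8_t *pred, int stride, int w,
                 int h, int64_t threshold) {
  return diff_variance(src, pred, stride, w, h) > threshold;
}
```

A constant offset between source and prediction yields zero variance and no split, while an alternating difference pattern yields a large value and triggers one.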
-
Jim Bankoski authored
Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a
-
Jim Bankoski authored
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
-
Jim Bankoski authored
This uses the speed feature functionality for code. Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8
-
Jim Bankoski authored
Forces us to go through slow partitioning for keyframes, altref and overlays. Change-Id: I1a286361bf74083e71973575a7296be46eb98742
-
Ronald S. Bultje authored
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 -> 3min58). Specific changes to timings for each function compared to original assembly-optimized versions (or just new version timings if no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
-
Jim Bankoski authored
need to rework these Change-Id: I17dc2c88d2faadd2f8fb117c52c25f04ea2e9856
-
Jim Bankoski authored
The new print out includes skips and has prefixed sections so you can grep to find things like transforms chosen on each frame. Change-Id: I195043424647d9514cfc3ff6720a5b20d010fa1b
-
Jim Bankoski authored
-
Jim Bankoski authored
Change-Id: I26e80ede80cb4389378a95afa95d229092a9859a
-
Jingning Han authored
Enable sign bias check and round-trip error unit tests for 4x4 hybrid transform modules. Change-Id: Icd3d839f098d4b92b00ff76eac146765b039d0d3
-
John Koleszar authored
-
Yaowu Xu authored
Since intra block decoding is handled by decode_sb_intra() separately. Change-Id: I42d757884714084c92fc23ec5d35d4dc946f4b15
-
- 19 Jun, 2013 - 2 commits
-
-
Dmitry Kovalev authored
Change-Id: Iab96e6a50aec543c63e15cd134f9d5f01ca7ceff
-
James Zern authored
Currently threading is internal to libvpx, so thread safety is unneeded in libgtest. Visual Studio builds already operate this way, as they do not have pthread.h available by default. This removes an unconditional link to libpthread, using $(extralibs) should libvpx require it. Change-Id: Ieae1d693406653a54b54fba818c598836797d33b
-