- 24 Jun, 2013 - 3 commits
-
-
Ronald S. Bultje authored
Instead, just allocate a few bytes on the stack. This is 4k, which isn't all that much. Change-Id: I82af6ee89e6ed01faaa23ff891ee7ced76df8c16
-
Yaowu Xu authored
-
John Koleszar authored
For cases where there's no transform set in bit 0 (the left edge of the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the left edge needs filtering), it was incorrectly being skipped before. This situation only happens on the leftmost edge of the image, as the edge at column 0 is intentionally skipped since there aren't pixels to the left to read. Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3
-
- 22 Jun, 2013 - 2 commits
-
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
Fixes crashes of test_libvpx on 32-bit Linux. Change-Id: If94e7628a86b788ca26c004861dee2f162e47ed6
-
- 21 Jun, 2013 - 14 commits
-
-
John Koleszar authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
Change-Id: I8fcab81e390f93dc17e9666bbf8f77883b5aa897
-
James Zern authored
Change-Id: Id54ad9a781634f075e990d5bade5be8490959975
-
Ronald S. Bultje authored
Fixes a crash on Windows when building with MSVC. Change-Id: I124ac756a1be55d190fadda5fcc46d23b1445dbf
-
Ronald S. Bultje authored
Change vp9_block_error() to return a 64bit error variable, change all callers to expect a 64bit return value (this will prevent overflows, which we basically don't check for at all right now). Remove duplicate block_error() function, which fixed that through truncation. Remove old (incompatible) mmx/sse2 block_error SIMD versions and replace with a new one that returns a 64bit value. Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to 3min23, i.e. a 3% overall speedup. Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
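The overflow concern above can be illustrated with a minimal scalar sketch. It is not the libvpx code: the name `block_error_sketch`, the `int16_t` coefficient type, and the signature are assumptions made for illustration only. The point is that each difference is widened before squaring and the sum is kept in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (not the libvpx implementation):
 * sum of squared differences between original and dequantized
 * coefficients, accumulated in a 64-bit variable. */
int64_t block_error_sketch(const int16_t *coeff, const int16_t *dqcoeff,
                           size_t n) {
  int64_t error = 0;
  size_t i;
  assert(coeff != NULL && dqcoeff != NULL);
  for (i = 0; i < n; ++i) {
    /* Widen before subtracting so neither the difference nor its
     * square can wrap. */
    const int64_t diff = (int64_t)coeff[i] - (int64_t)dqcoeff[i];
    error += diff * diff;
  }
  return error;
}
```

With worst-case 16-bit inputs a single coefficient pair already contributes more than `INT32_MAX`, so a 32-bit return value would have to truncate or wrap.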
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
3% faster overall (3min35.0 to 3min28.5). Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
-
Yaowu Xu authored
-
Yaowu Xu authored
and remove unused code. Change-Id: If380440c4450294b5450b7a9eeb94a376846ec01
-
Yaowu Xu authored
-
Yaowu Xu authored
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
-
Yaowu Xu authored
-
- 20 Jun, 2013 - 21 commits
-
-
Ronald S. Bultje authored
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to 3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't perfectly interleaved, and can probably be improved further in the future. I've marked this with a few TODOs/FIXMEs in the code. Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
-
Jim Bankoski authored
-
Jim Bankoski authored
Change-Id: Idfd69e66e8982275eb00d8007a55efd1a4f86a98
-
James Zern authored
-
Frank Galligan authored
- size_t vs int. Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134
-
James Zern authored
This reverts commit 90a9900a. Seems to break the Mac build:
src/include/gtest/internal/gtest-port.h:1208:: pthread_mutex_lock(&mutex_)failed with error 22
Abort trap: 6
Change-Id: Icbe31161d7c27f1b0a28d33409e7712430bbf0ae
-
Jingning Han authored
-
Johann authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Deb Mukherjee authored
Improves the rd modeling function and implements it using interpolation from a table, which is a little faster. Also uses sse as input to the modeling function rather than var, since there is no dc prediction used and as a result the sse works a little better. derfraw300: +0.05% Speedup: ~1% Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff
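As a rough illustration of table-based interpolation in general, the sketch below linearly interpolates between uniformly spaced samples. It does not reproduce the encoder's actual table, its spacing, or its input scaling: `interp_table`, `step`, and the use of `double` are all hypothetical choices for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: linearly interpolate y(x) from samples
 * tab[i] = y(i * step), clamping x to the range the table covers. */
double interp_table(const double *tab, size_t n, double step, double x) {
  double pos, frac;
  size_t i;
  assert(tab != NULL && n >= 2 && step > 0.0);
  if (x <= 0.0) return tab[0];
  pos = x / step;
  /* Clamp before converting to an index so the cast stays in range. */
  if (pos >= (double)(n - 1)) return tab[n - 1];
  i = (size_t)pos;
  frac = pos - (double)i;
  return tab[i] + frac * (tab[i + 1] - tab[i]);
}
```

A lookup like this replaces the evaluation of a costlier closed-form model with one division, one index, and one multiply-add, which is the usual reason such a change gives a small speedup.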
-
Johann authored
dboolhuff.c(50) : warning C4267: 'initializing' : conversion from 'size_t' to 'int' Change-Id: I6b85759efb2fa19f362f406623d8a7583a55c036
-
Jim Bankoski authored
Adds a new speed feature to force partitioning to be greater than or less than a certain size. Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0
-
Jim Bankoski authored
This feature lets you set a partitioning size to be used by the entire frame. Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06
-
Jim Bankoski authored
This uses variance to split partitions. Variance is calculated using the nearest mv, always from the last ref frame. Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
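The idea of a variance-driven split decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoder's code: `residual_variance`, `should_split`, and the threshold are hypothetical, and `pred` stands in for the block fetched from the last reference frame at the nearest mv.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: unnormalized variance (sum of squared deviations
 * from the mean) of the residual src - pred over a w x h block. */
int64_t residual_variance(const uint8_t *src, int src_stride,
                          const uint8_t *pred, int pred_stride,
                          int w, int h) {
  int64_t sum = 0, sse = 0;
  int r, c;
  assert(w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int d = src[r * src_stride + c] - pred[r * pred_stride + c];
      sum += d;
      sse += d * d;
    }
  }
  return sse - (sum * sum) / ((int64_t)w * h);
}

/* Hypothetical decision rule: split when the residual is "busy" enough. */
int should_split(int64_t variance, int64_t threshold) {
  return variance > threshold;
}
```

A block whose residual is flat (for example a uniform brightness offset) yields zero and stays unsplit, while a block with an uneven residual exceeds the threshold and is split further.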
-
Jim Bankoski authored
Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a
-
Jim Bankoski authored
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
-
Jim Bankoski authored
This uses the speed feature functionality for code. Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8
-
Jim Bankoski authored
Force us to go through slow partitioning for keyframes, altref and overlays. Change-Id: I1a286361bf74083e71973575a7296be46eb98742
-
Ronald S. Bultje authored
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 -> 3min58). Specific changes to timings for each function compared to original assembly-optimized versions (or just new version timings if no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
-
Jim Bankoski authored
Need to rework these. Change-Id: I17dc2c88d2faadd2f8fb117c52c25f04ea2e9856
-