Commits · 4dc70fa7f9a47096b5062f7d0600aafe4878dec4 · BC / public / external / libvpx

24 Jun, 2013 - 3 commits

Don't re-allocate comp_pred buffers for each call to comp motion search. · 4dc70fa7

Ronald S. Bultje authored 11 years ago

Instead, just allocate a fixed buffer on the stack. This is 4k, which
isn't all that much.

Change-Id: I82af6ee89e6ed01faaa23ff891ee7ced76df8c16

4dc70fa7
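
As a rough illustration of the idea in this commit, the sketch below keeps the compound predictor in a fixed 4k stack array instead of allocating and freeing a buffer on every call. The function name `comp_sad_sketch`, the constant `MAX_BLOCK_DIM`, and the rounding average are assumptions made for the example and are not taken from the libvpx source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { MAX_BLOCK_DIM = 64 }; /* assumption: 64 * 64 bytes = 4k */

/* Hypothetical helper: averages two predictors into a scratch buffer and
 * returns the sum of absolute differences against src. All three inputs
 * are assumed to be packed w * h arrays. */
static unsigned int comp_sad_sketch(const uint8_t *src, const uint8_t *pred0,
                                    const uint8_t *pred1, int w, int h) {
  /* Scratch buffer lives on the stack: no malloc/free per call. */
  uint8_t comp_pred[MAX_BLOCK_DIM * MAX_BLOCK_DIM];
  unsigned int sad = 0;
  int i;

  assert(w > 0 && h > 0 && w <= MAX_BLOCK_DIM && h <= MAX_BLOCK_DIM);

  /* Rounding average of the two predictors. */
  for (i = 0; i < w * h; ++i)
    comp_pred[i] = (uint8_t)((pred0[i] + pred1[i] + 1) >> 1);

  for (i = 0; i < w * h; ++i)
    sad += (unsigned int)abs(src[i] - comp_pred[i]);

  return sad;
}
```

The trade-off is a larger stack frame in exchange for avoiding allocator calls in a function that the motion search may invoke many times per block.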

Merge "Fix loopfilter of leftmost 4x4 edges in SB" · 93f88ab5
Yaowu Xu authored 11 years ago

93f88ab5

Fix loopfilter of leftmost 4x4 edges in SB · 858475a0

John Koleszar authored 11 years ago

For cases where there's no transform set in bit 0 (the left edge of
the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the
left edge needs filtering), it was incorrectly being skipped before.
This situation only happens on the leftmost edge of the image, as
the edge at column 0 is intentionally skipped since there aren't
pixels to the left to read.

Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3

858475a0
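
The shape of this fix can be sketched as follows, assuming one bit per 4-pixel column in each mask. The function `count_filtered_edges` and its parameter `mask_left` are hypothetical; only `mask_4x4_int` is named in the commit message. The point is that the interior edge is tested on its own and is not skipped when the left-edge bit is clear.

```c
#include <assert.h>

/* Hypothetical sketch: counts the vertical edges that would be filtered in
 * one row. Bit c of mask_left marks an edge on the left boundary of column
 * c; bit c of mask_4x4_int marks the interior edge 4 pixels to the right
 * of that boundary. */
static int count_filtered_edges(unsigned int mask_left,
                                unsigned int mask_4x4_int, int cols) {
  int count = 0;
  int c;

  assert(cols >= 0 && cols <= 32);

  for (c = 0; c < cols; ++c) {
    const unsigned int bit = 1u << c;

    if (mask_left & bit) ++count; /* edge on the column boundary */

    /* Checked independently of mask_left, so the interior edge is still
     * filtered when the boundary bit is clear (as at column 0). */
    if (mask_4x4_int & bit) ++count;
  }
  return count;
}
```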

22 Jun, 2013 - 2 commits
- Merge "Allocate memory using appropriate expected alignment in unit tests." · 4eb8c565
  Ronald S. Bultje authored 11 years ago
  
  4eb8c565
- Allocate memory using appropriate expected alignment in unit tests. · ac6ea2ab
  Ronald S. Bultje authored 11 years ago
```
Fixes crashes of test_libvpx on 32-bit Linux.

Change-Id: If94e7628a86b788ca26c004861dee2f162e47ed6
```
  ac6ea2ab
21 Jun, 2013 - 14 commits
- Merge "Add some unaligned test vectors" · 0c8e13d2
  John Koleszar authored 11 years ago
  
  0c8e13d2
- Merge "Remove emms - that shouldn't be there." · 98188e0e
  Ronald S. Bultje authored 11 years ago
  
  98188e0e
- Remove emms - that shouldn't be there. · fc033b38
  Ronald S. Bultje authored 11 years ago
```
Change-Id: I8fcab81e390f93dc17e9666bbf8f77883b5aa897
```
  fc033b38
- variance_test: use REGISTER_STATE_CHECK · cc774c8b
  James Zern authored 11 years ago
```
Change-Id: Id54ad9a781634f075e990d5bade5be8490959975
```
  cc774c8b
- Add missing SECTION .text marker in assembly file. · ba42c026
  Ronald S. Bultje authored 11 years ago
```
Fixes a crash on Windows when building with MSVC.

Change-Id: I124ac756a1be55d190fadda5fcc46d23b1445dbf
```
  ba42c026
- Implement SSE2 block_error. · 54b2a596
  Ronald S. Bultje authored 11 years ago
```
Change vp9_block_error() to return a 64bit error variable, change all
callers to expect a 64bit return value (this will prevent overflows,
which we basically don't check for at all right now). Remove duplicate
block_error() function, which fixed that through truncation. Remove
old (incompatible) mmx/sse2 block_error SIMD versions and replace with
a new one that returns a 64bit value.

Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to
3min23, i.e. a 3% overall speedup.

Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
```
  54b2a596
- Merge "Add subtract_block SSE2 version and unit test." · 7756e989
  Ronald S. Bultje authored 11 years ago
  
  7756e989
- Merge "SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance()." · 9a480482
  Ronald S. Bultje authored 11 years ago
  
  9a480482
- Add subtract_block SSE2 version and unit test. · 25c588b1
  Ronald S. Bultje authored 11 years ago
```
3% faster overall (3min35.0 to 3min28.5).

Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
```
  25c588b1
- Merge "Get some speed back for cpuused 1" · 869d7706
  Yaowu Xu authored 11 years ago
  
  869d7706
- Get some speed back for cpuused 1 · 45e25a78
  Yaowu Xu authored 11 years ago
```
and remove unused code.

Change-Id: If380440c4450294b5450b7a9eeb94a376846ec01
```
  45e25a78
- Merge "rename variables to avoid build error in MSVC" · 61721181
  Yaowu Xu authored 11 years ago
  
  61721181
- rename variables to avoid build error in MSVC · ee07a261
  Yaowu Xu authored 11 years ago
```
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
```
  ee07a261
- Merge "Implement sse2 and ssse3 versions for all sub_pixel_variance sizes." · e6cd5ed3
  Yaowu Xu authored 11 years ago
  
  e6cd5ed3
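
For the "Implement SSE2 block_error." commit above, a plain C sketch of a block error with a 64-bit return value might look like the following. This is not the SSE2 version and not the libvpx source; the name `block_error_sketch` and its signature are assumptions. It shows why the wider accumulator matters: a single 16-bit coefficient pair can already produce a squared difference that does not fit in a signed 32-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference version: sum of squared differences between the
 * original and dequantized coefficients, accumulated in 64 bits. */
static int64_t block_error_sketch(const int16_t *coeff,
                                  const int16_t *dqcoeff, int n) {
  int64_t error = 0;
  int i;

  assert(n >= 0);

  for (i = 0; i < n; ++i) {
    const int diff = coeff[i] - dqcoeff[i];
    /* Widen before multiplying so the product cannot overflow int. */
    error += (int64_t)diff * diff;
  }
  return error;
}
```

With `coeff[0] = 32767` and `dqcoeff[0] = -32768`, the difference is 65535 and its square is 4294836225, which is larger than `INT32_MAX`.
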
20 Jun, 2013 - 21 commits

SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance(). · 1e6a32f1

Ronald S. Bultje authored 11 years ago

Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to
3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions
which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't
perfectly interleaved, and can probably be improved further in the
future. I've marked this with a few TODOs/FIXMEs in the code.

Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9

1e6a32f1
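
The bilinear-filter condition quoted in this commit message can be written as a small predicate. The sketch assumes sub-pixel offsets in the range 0 to 15, so that 0 and 8 are the only values with the low three bits clear; that range and the name `needs_bilinear` are assumptions, not taken from the source.

```c
#include <assert.h>

/* Hypothetical predicate: returns 1 when either offset has any of its low
 * three bits set, i.e. (x_offset & 7 || y_offset & 7). Offsets are assumed
 * to lie in 0..15. */
static int needs_bilinear(int x_offset, int y_offset) {
  assert(x_offset >= 0 && x_offset <= 15);
  assert(y_offset >= 0 && y_offset <= 15);
  return (x_offset & 7) || (y_offset & 7);
}
```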

Merge "clean out libvpx-srcs.txt if built" · 84490a1f
Jim Bankoski authored 11 years ago

84490a1f

clean out libvpx-srcs.txt if built · 975df8c7
Jim Bankoski authored 11 years ago
```
Change-Id: Idfd69e66e8982275eb00d8007a55efd1a4f86a98
```
975df8c7

Merge "Revert "test_libvpx: disable pthreads in gtest"" · 43d04ef9
James Zern authored 11 years ago

43d04ef9

Fix win64 warning. · c259af4f
Frank Galligan authored 11 years ago
```
- size_t vs int.

Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134
```
c259af4f

Revert "test_libvpx: disable pthreads in gtest" · f2dc3825

James Zern authored 11 years ago

This reverts commit 90a9900a

Seems to break the Mac build:
src/include/gtest/internal/gtest-port.h:1208:: pthread_mutex_lock(&mutex_)failed with error 22
Abort trap: 6

Change-Id: Icbe31161d7c27f1b0a28d33409e7712430bbf0ae

f2dc3825

Merge "Add unit tests for 4x4 ADST" · 4f4713b4
Jingning Han authored 11 years ago

4f4713b4

Merge "Cast value to avoid size_t/int warning on win64" · 0373e517
Johann authored 11 years ago

0373e517

Merge "Renaming 'nmv' to 'mv' for several functions." · 8283d893
Dmitry Kovalev authored 11 years ago

8283d893

Merge "Function decomposition inside vp9_decodemv.c file." · 77186ee6
Dmitry Kovalev authored 11 years ago

77186ee6

Improving model rd with variance and quant step · 7947a33d

Deb Mukherjee authored 11 years ago

Improves the rd modeling function and implements it using interpolation
from a table, which is a little faster. Also uses sse rather than var as
input to the modeling function, since no dc prediction is used and sse
therefore works a little better.

derfraw300: +0.05%
Speedup: ~1%

Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff

7947a33d

Cast value to avoid size_t/int warning on win64 · d94aee68

Johann authored 11 years ago

dboolhuff.c(50) : warning C4267: 'initializing' : conversion from
'size_t' to 'int'

Change-Id: I6b85759efb2fa19f362f406623d8a7583a55c036

d94aee68
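
A minimal sketch of the kind of change that silences warning C4267 is shown below: the `size_t` value is range-checked and then cast explicitly before it initializes an `int`. The function `bits_available_sketch` is hypothetical and is not the code at dboolhuff.c(50).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: converts a byte count held in a size_t to a bit
 * count held in an int. */
static int bits_available_sketch(size_t bytes_left) {
  /* Guard the narrowing conversion; the cast alone only hides the warning. */
  assert(bytes_left <= (size_t)(INT_MAX / CHAR_BIT));

  /* Explicit cast: on win64, size_t is 64 bits wide and int is 32. */
  return (int)bytes_left * CHAR_BIT;
}
```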

adds force partitioning greater than or less than block size · 9f2a1ae2

Jim Bankoski authored 11 years ago

adds a new speed feature to force partitioning to be greater than
or less than a certain size

Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0

9f2a1ae2

adds a set partitioning to speed features · 18bdf708

Jim Bankoski authored 11 years ago

this feature lets you set a partitioning size to be used by the entire
frame.

Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06

18bdf708

partition by variance using var from last frame · 476d73d2

Jim Bankoski authored 11 years ago

This uses variance to split partitions. Variance is calculated using
the nearest mv, always from the last ref frame.

Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896

476d73d2

convert all speed things to speed features · 1f94b976
Jim Bankoski authored 11 years ago
```
Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a
```
1f94b976

new partition via variance · 727fa7b1
Jim Bankoski authored 11 years ago
```
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
```
727fa7b1

fix to set up new speed feature · 0fad6a9d

Jim Bankoski authored 11 years ago

This uses the speed feature functionality for code.

Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8

0fad6a9d

don't copy partitions for key frames or altrefs · df2314cf

Jim Bankoski authored 11 years ago

force us to go through slow partitioning for keyframes, altref and
overlays.

Change-Id: I1a286361bf74083e71973575a7296be46eb98742

df2314cf

Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. · 8fb6c581

Ronald S. Bultje authored 11 years ago

Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
3min58). Specific changes to timings for each function compared to
original assembly-optimized versions (or just new version timings if
no previous assembly-optimized version was available):

sse2   4x4:    99 ->   82 cycles
sse2   4x8:           128 cycles
sse2   8x4:           121 cycles
sse2   8x8:   149 ->  129 cycles
sse2   8x16:  235 ->  245 cycles (?)
sse2  16x8:   269 ->  203 cycles
sse2  16x16:  441 ->  349 cycles
sse2  16x32:          641 cycles
sse2  32x16:          643 cycles
sse2  32x32: 1733 -> 1154 cycles
sse2  32x64:         2247 cycles
sse2  64x32:         2323 cycles
sse2  64x64: 6984 -> 4442 cycles

ssse3  4x4:           100 cycles (?)
ssse3  4x8:           103 cycles
ssse3  8x4:            71 cycles
ssse3  8x8:           147 cycles
ssse3  8x16:          158 cycles
ssse3 16x8:   188 ->  162 cycles
ssse3 16x16:  316 ->  273 cycles
ssse3 16x32:          535 cycles
ssse3 32x16:          564 cycles
ssse3 32x32:          973 cycles
ssse3 32x64:         1930 cycles
ssse3 64x32:         1922 cycles
ssse3 64x64:         3760 cycles

Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d

8fb6c581
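
The figures in this commit message can be checked with simple arithmetic. The helper below, whose name `percent_reduction` is an assumption for the example, computes the relative reduction between a before and an after measurement, whether those are seconds or cycles.

```c
#include <assert.h>

/* Hypothetical helper: percentage by which a measurement dropped.
 * A negative result means the measurement got larger (a regression). */
static double percent_reduction(double before, double after) {
  assert(before > 0.0);
  return 100.0 * (before - after) / before;
}
```

For the encode time, 4min10 is 250 s and 3min58 is 238 s, so `percent_reduction(250.0, 238.0)` gives 4.8, consistent with the "around 5%" stated above. For the sse2 8x16 row marked with (?), `percent_reduction(235.0, 245.0)` is negative, which reflects the slowdown the author flagged.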

disable speed > 1 speed corrections in firstpass · f954490b
Jim Bankoski authored 11 years ago
```
need to rework these

Change-Id: I17dc2c88d2faadd2f8fb117c52c25f04ea2e9856
```
f954490b