Commits · 29731308c4667de4fe4f02f92f0c2b29af86bbc1 · BC / public / external / libvpx

07 Feb, 2013 - 3 commits

Added skip switches for SB32 and SB64 · 29731308

Paul Wilkins authored 12 years ago

Added switches and code to skip/break out of
the SB32 and SB64 tests based on whether
the 16x16 MB tests used split modes, and to
optionally skip 64x64 if 16x16 was chosen over
32x32.

The impact on encode speed varies by clip, from
a few percent up to almost 50%. Only the
split mode breakout is currently enabled.

Change-Id: Ib5836140b064b350ffa3057778ed2cadcc495cf8

29731308
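
The breakout described in "Added skip switches for SB32 and SB64" might look roughly like the sketch below. The type `Mb16Results`, both function names, and the rule that a single split-mode macroblock is enough to skip the 32x32 test are assumptions made for illustration; the commit message does not give the actual condition or identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the 16x16 mode search for the four
 * macroblocks covered by one 32x32 superblock (not libvpx's types). */
typedef struct {
  bool used_split_mode[4]; /* true if that MB picked a split mode */
} Mb16Results;

/* Assumed rule: skip the 32x32 test if any 16x16 MB chose a split mode. */
bool skip_sb32_test(const Mb16Results *r, bool split_breakout_enabled) {
  assert(r != NULL);
  if (!split_breakout_enabled) return false;
  for (int i = 0; i < 4; ++i) {
    if (r->used_split_mode[i]) return true;
  }
  return false;
}

/* Assumed rule for the optional second switch: skip the 64x64 test
 * when 16x16 was chosen over 32x32. */
bool skip_sb64_test(bool mb16_beat_sb32, bool size_breakout_enabled) {
  return size_breakout_enabled && mb16_beat_sb32;
}
```

Keeping each breakout behind its own flag mirrors the message's note that only the split mode breakout is enabled for now.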

Use fdct8x4 instead of fdct4x4 where the block size allows it. · 5cfd82bc

Ronald S. Bultje authored 12 years ago

This allows for faster SIMD implementations in the future (currently
there is no speed impact).

Change-Id: I732647e9148b5dcb44e6bc8728138f0141218329

5cfd82bc

Use configure checks for various inline keywords. · aac73df1
Ronald S. Bultje authored 12 years ago
```
Change-Id: I8508f1a3d3430f998bb9295f849e88e626a52a24
```
aac73df1
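
A sketch of how code might consume the result of such a configure check. It assumes the configure step test-compiles candidate keywords and writes the first one that works into a generated header as an `INLINE` macro; that macro name, the fallback, and `clamp_int` are illustrative and not confirmed by the commit message.

```c
#include <assert.h>

/* Assumption: a generated config header normally defines INLINE.
 * This fallback only stands in for that generated definition. */
#ifndef INLINE
#define INLINE inline
#endif

/* Code uses INLINE instead of a compiler-specific keyword. */
static INLINE int clamp_int(int value, int low, int high) {
  assert(low <= high);
  return value < low ? low : (value > high ? high : value);
}
```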

06 Feb, 2013 - 6 commits
- Add sse2 versions of sub_pixel_variance{32x32,64x64}. · a788e0fe
  Ronald S. Bultje authored 12 years ago
```
7.5% faster overall encoding.

Change-Id: Ie9bb7f9fdf93659eda106404cb342525df1ba02f
```
  a788e0fe
- Merge "Reindent segmentation code." into experimental · a001fe97
  Ronald S. Bultje authored 12 years ago
  
  a001fe97
- Reindent segmentation code. · 55cafb61
  Ronald S. Bultje authored 12 years ago
```
Indentation was off by 2 spaces for this particular block.

Change-Id: I1e587b7ad3eff77ade5521252d20c7bb2daa0f6d
```
  55cafb61
- Eliminate tautology · 31cbe2ed
  John Koleszar authored 12 years ago
```
      Unreachable code
  that does nothing anyway
      removed forever.

Change-Id: I14105d2dd9dbc9d558f36464055e350dbeb45488
```
  31cbe2ed
- Merge "Change definition of NearestMV." into experimental · 8b4e9c59
  Paul Wilkins authored 12 years ago
  
  8b4e9c59
- Fix mismatch after merge of the tiling patch. · 278df745
  Ronald S. Bultje authored 12 years ago
```
Change-Id: I8ecc178b4d4069e721c7fec6d7631c00e4a3e5d5
```
  278df745
05 Feb, 2013 - 11 commits

[WIP] Add column-based tiling. · 1407bdc2

Ronald S. Bultje authored 12 years ago

This patch adds column-based tiling. The idea is to make each tile
independently decodable (after reading the common frame header) and
also independently encodable (minus within-frame cost adjustments in
the RD loop) to speed up hardware & software en/decoders if they use
multi-threading. Column-based tiling has the added advantage (over
other tiling methods) that it minimizes realtime use-case latency,
since all threads can start encoding data as soon as the first SB-row
worth of data is available to the encoder.

There is some test code that does random tile ordering in the decoder,
to confirm that each tile is indeed independently decodable from other
tiles in the same frame. At tile edges, all contexts assume default
values (i.e. 0, 0 motion vector, no coefficients, DC intra4x4 mode),
and motion vector search and ordering do not cross tiles in the same
frame.

Tile independence is not maintained between frames ATM, i.e. tile 0 of
frame 1 is free to use motion vect...

1407bdc2
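
To illustrate the column-based tiling idea, the sketch below splits a row of superblock columns into tile columns and treats the left neighbour as unavailable at a tile edge. The even-split formula and both function names are assumptions for illustration; the patch's actual partitioning and context handling are not shown in the (truncated) message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical even split of superblock columns into tile columns.
 * Tile tile_idx covers columns [*start, *end). */
void tile_col_range(int num_sb_cols, int num_tile_cols, int tile_idx,
                    int *start, int *end) {
  assert(num_tile_cols > 0 && num_tile_cols <= num_sb_cols);
  assert(tile_idx >= 0 && tile_idx < num_tile_cols);
  *start = (tile_idx * num_sb_cols) / num_tile_cols;
  *end = ((tile_idx + 1) * num_sb_cols) / num_tile_cols;
}

/* A left neighbour only provides context if it lies inside the same
 * tile; otherwise the caller would fall back to default values. */
bool left_context_available(int sb_col, int tile_start) {
  return sb_col > tile_start;
}
```

Because every tile spans the full frame height, each thread can begin work on its own column range as soon as the first superblock row arrives, which is the latency argument made in the message.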

Merge "Add SSE3 versions for sad{32x32,64x64}x4d functions." into experimental · 82286413
Ronald S. Bultje authored 12 years ago

82286413
Merge "fix a build issue with MSVC on windows" into experimental · 9e3e7439
Yaowu Xu authored 12 years ago

9e3e7439
Merge "rewrite 4x4 idct and fdct" into experimental · c9ae73b2
Yaowu Xu authored 12 years ago

c9ae73b2
Add SSE3 versions for sad{32x32,64x64}x4d functions. · 58c983d1
Ronald S. Bultje authored 12 years ago
```
Overall encoding about 15% faster.

Change-Id: I176a775c704317509e32eee83739721804120ff2
```
58c983d1
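
For readers unfamiliar with the "x4d" naming, a scalar sketch of what such a function computes: the sum of absolute differences (SAD) of one source block against four reference candidates in one call. The names and the generic width/height parameters are illustrative assumptions; this is not the SSE3 code from the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference: SAD of one w x h block. */
static unsigned int sad_block(const uint8_t *src, int src_stride,
                              const uint8_t *ref, int ref_stride,
                              int w, int h) {
  unsigned int sad = 0;
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      sad += (unsigned int)abs(src[x] - ref[x]);
    }
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}

/* "x4d" form: one source block against four reference candidates. */
void sad_x4d(const uint8_t *src, int src_stride,
             const uint8_t *const ref[4], int ref_stride,
             int w, int h, unsigned int sad_out[4]) {
  assert(src != NULL && ref != NULL && sad_out != NULL);
  for (int i = 0; i < 4; ++i) {
    sad_out[i] = sad_block(src, src_stride, ref[i], ref_stride, w, h);
  }
}
```

A SIMD version can presumably load each source row once and reuse it for all four candidates, which is one plausible source of the reported speedup.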
fix a build issue with MSVC on windows · 77f889b2
Yaowu Xu authored 12 years ago
```
for idct 16x16 unit test

Change-Id: I51da9405c3a4d7bb3f4cdf062aaccaa90b33dca4
```
77f889b2

rewrite 4x4 idct and fdct · fa36981e

Yaowu Xu authored 12 years ago

This commit changes the 4x4 iDCT to use the same algorithm & constants as
the other iDCTs. The 4x4 fDCT is also changed to be based on the new iDCT.

Change-Id: Ib1a902693228af903862e1f5a08078c36f2089b0

fa36981e

Change definition of NearestMV. · 81043e8d

Paul Wilkins authored 12 years ago

This commit makes the NearestMV match the chosen
best reference MV. It can be a 0,0 or non-zero vector,
which means the compound nearest mv mode can
combine a 0,0 and a non-zero vector.

Change-Id: I2213d09996ae2916e53e6458d7d110350dcffd7a

81043e8d

Merge "Added vp9_short_idct1_32x32_c" into experimental · 77440d50
Scott LaVarnway authored 12 years ago

77440d50
Merge "Re-factor code for rd thresholds." into experimental · fb4b533d
Paul Wilkins authored 12 years ago

fb4b533d

Added vp9_short_idct1_32x32_c · 5780c4cb

Scott LaVarnway authored 12 years ago

and called this function in vp9_dequant_idct_add_32x32_c when
eob == 1.  For the test clip used, the decoder performance improved
by 21+%.  Based on Yaowu's 16 point idct work.

Change-Id: Ib579a90fed531d45777980e04bf0c9b23c093c43

5780c4cb
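
The eob == 1 shortcut works because a block whose only non-zero coefficient is DC inverse-transforms to a constant. A minimal sketch follows, assuming 14-bit fixed-point constants and a final rounding shift of 6 for the 32x32 case. The function names and that final shift are assumptions, not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

#define DCT_CONST_BITS 14
static const int kCospi16_64 = 11585; /* round(2^14 * cos(pi/4)) */

/* Rounding right shift; assumes arithmetic shift for negative values. */
static int round_shift(int x, int bits) {
  return (x + (1 << (bits - 1))) >> bits;
}

/* Value added to every pixel when only the DC coefficient is set.
 * The final shift of 6 is an assumed output scaling for 32x32. */
int dc_only_idct_value(int16_t dc) {
  int out = round_shift(dc * kCospi16_64, DCT_CONST_BITS); /* row pass */
  out = round_shift(out * kCospi16_64, DCT_CONST_BITS);    /* column pass */
  return round_shift(out, 6);
}

/* Adds the constant to a 32x32 block of pixels, clamping to 0..255. */
void dc_only_add_32x32(int16_t dc, uint8_t *dst, int stride) {
  assert(dst != 0);
  const int v = dc_only_idct_value(dc);
  for (int r = 0; r < 32; ++r) {
    for (int c = 0; c < 32; ++c) {
      const int p = dst[r * stride + c] + v;
      dst[r * stride + c] = (uint8_t)(p < 0 ? 0 : (p > 255 ? 255 : p));
    }
  }
}
```

This replaces the full 32x32 butterfly network with two multiplies and one pass over the block, which is consistent with the decoder gain reported for clips with many DC-only blocks.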

04 Feb, 2013 - 5 commits

Re-factor code for rd thresholds. · 3ab53876

Paul Wilkins authored 12 years ago

Separate out code to set the main encode speed
related rd thresholds. Some values changed from
the initial defaults for various new modes.

Quality test results pending but even the addition
of some further non-zero defaults helps encode speed
somewhat in limited testing on derf clips.

Adjustment of thresholds for quality / speed tradeoff
to follow.

Change-Id: I117ee473157e151a1b93193d5f393449328de20d

3ab53876

Added INT16_MIN and INT16_MAX for MSVC builds · dea14332

Yaowu Xu authored 12 years ago

These macros were not defined in earlier versions of MSVC.

Change-Id: I8270a3abb7c6e9ead1931a653d7e41f877a1017b

dea14332
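
A fallback of the kind this commit describes could be written as below. The guard style and the helper `saturate_int16` are assumptions for illustration; the commit message does not show where or how libvpx defines the macros.

```c
#include <assert.h>

/* Define the limits only if no header has provided them already. */
#ifndef INT16_MAX
#define INT16_MAX 32767
#endif
#ifndef INT16_MIN
#define INT16_MIN (-32767 - 1)
#endif

/* Hypothetical helper: saturate a wider value to the int16 range. */
int saturate_int16(int value) {
  const int result =
      value > INT16_MAX ? INT16_MAX : (value < INT16_MIN ? INT16_MIN : value);
  assert(result >= INT16_MIN && result <= INT16_MAX); /* postcondition */
  return result;
}
```

Writing the minimum as `(-32767 - 1)` avoids relying on how the compiler types the literal `32768`.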

enable 16x16 iDCT unit test · ebd58089

Yaowu Xu authored 12 years ago

The test for the forward transform will be enabled later, after the
forward transform is redone.

Change-Id: Ie7c7cf88baf7ecbebbe52fe027e1c3b33d3b9d49

ebd58089

re-write 8 point idct · 1eb79dc1

Yaowu Xu authored 12 years ago

to be consistent with idct16 and idct32.

Change-Id: Ie89dbd32b65c33274b7fecb4b41160fcf1962204

1eb79dc1

a couple of minor fixes · ccaaeb4b

Yaowu Xu authored 12 years ago

fixed a function prototype to prevent compiler warnings;
removed a function not in use;
un-capitalized "Refstride" to ref_stride

Change-Id: Ib4472b6084f357d96328c6a06e795b6813a9edba

ccaaeb4b

01 Feb, 2013 - 3 commits

Merge "Changes 16 point idct" into experimental · af4c9d2f
Yaowu Xu authored 12 years ago

af4c9d2f
Merge "fix a small bug in 16 point forward dct" into experimental · c1f611be
Yaowu Xu authored 12 years ago

c1f611be

Changes 16 point idct · 91e0e801

Yaowu Xu authored 12 years ago

This commit changes the inverse 16 point dct to use the same algorithm
as the one for the 32 point idct. In fact, the 16 point idct now uses the
exact source code of the even portion of the 32 point idct.

Tests showed the current implementation has significantly better accuracy
than the previous version. With this implementation and the minor bug
fix on the forward 16 point dct, encoding tests showed about 0.2% better
compression on the CIF set; test results on the std-hd set are pending.

Change-Id: I68224b60c816ba03434e9f08bee147c7e344fb63

91e0e801

31 Jan, 2013 - 3 commits

fix a small bug in 16 point forward dct · ab1cad9b

Yaowu Xu authored 12 years ago

This commit fixes a minor error in the 16 point fdct, wherein a rotation
can produce a result of -1 instead of 0.

Change-Id: I45aac4a52bcd06225c6d04e643547a13e1c1aade

ab1cad9b
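
One common way a fixed-point rotation ends up at -1 where 0 is expected is a truncating right shift on a small negative intermediate. The sketch below shows that effect only; whether it is the actual cause of the bug fixed here is an assumption, and the constants and function names are illustrative.

```c
#include <assert.h>

#define DCT_CONST_BITS 14
static const int kCos = 15137; /* round(2^14 * cos(pi/8)) */
static const int kSin = 6270;  /* round(2^14 * sin(pi/8)) */

/* Truncating shift: floors toward negative infinity for negative
 * inputs (assuming arithmetic right shift of signed values). */
int rotate_truncate(int a, int b) {
  assert(a > -65536 && a < 65536 && b > -65536 && b < 65536);
  return (a * kSin - b * kCos) >> DCT_CONST_BITS;
}

/* Rounding shift: adds half of the divisor before shifting. */
int rotate_round(int a, int b) {
  assert(a > -65536 && a < 65536 && b > -65536 && b < 65536);
  return (a * kSin - b * kCos + (1 << (DCT_CONST_BITS - 1))) >>
         DCT_CONST_BITS;
}
```

With a = 2 and b = 1 the exact result is about -0.16: the truncating version returns -1, while the rounding version returns 0.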

Merge "A fix point implementation of 32x32 idct" into experimental · c94e55ad
Yaowu Xu authored 12 years ago

c94e55ad

A fix point implementation of 32x32 idct · 5149d7f7

Yaowu Xu authored 12 years ago

This commit changes the 32x32 idct to use integer arithmetic only. The
algorithm was taken directly from "A Fast Computational Algorithm for the
Discrete Cosine Transform" by W. Chen, et al., which was published in
IEEE Transactions on Communications, Vol. COM-25, No. 9, 1977. The signal
flow graph in the original paper is for a 32 point forward dct; the
current implementation of the inverse DCT was done by following the graph
in the reverse direction.

With this implementation, the 32 point inverse dct contains a 16 point
inverse dct in its even portion, similarly the 16 point idct further
contains 8 point and 4 point inverse dcts.

As of patch 4, encoding tests showed there is no compression loss when
compared against the floating point baseline. Numbers even showed very
small positives (cif: .01%, std-hd: .05%).

Change-Id: I2d2d17a424b0b04b42422ef33ec53f5802b0f378

5149d7f7
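
The nesting described above (a larger inverse DCT containing a smaller one in its even portion) can be seen at the smallest scale in a 4 point integer idct, whose even half is a 2 point inverse DCT. The sketch below assumes 14-bit fixed-point cosine constants and rounding shifts; the names are illustrative and this is not the commit's source code.

```c
#include <assert.h>
#include <stdint.h>

#define DCT_CONST_BITS 14
static const int kCospi8_64 = 15137;  /* round(2^14 * cos(pi/8))  */
static const int kCospi16_64 = 11585; /* round(2^14 * cos(pi/4))  */
static const int kCospi24_64 = 6270;  /* round(2^14 * cos(3pi/8)) */

/* Rounding right shift; assumes arithmetic shift for negative values. */
static int round_shift(int x) {
  return (x + (1 << (DCT_CONST_BITS - 1))) >> DCT_CONST_BITS;
}

/* 4 point integer inverse DCT using integer multiplies and shifts only. */
void idct4_sketch(const int16_t in[4], int16_t out[4]) {
  assert(in != 0 && out != 0);
  int step[4];
  /* Even portion: a 2 point inverse DCT on in[0] and in[2]. */
  step[0] = round_shift((in[0] + in[2]) * kCospi16_64);
  step[1] = round_shift((in[0] - in[2]) * kCospi16_64);
  /* Odd portion: one rotation of in[1] and in[3]. */
  step[2] = round_shift(in[1] * kCospi24_64 - in[3] * kCospi8_64);
  step[3] = round_shift(in[1] * kCospi8_64 + in[3] * kCospi24_64);
  /* Final butterfly combines even and odd portions. */
  out[0] = (int16_t)(step[0] + step[3]);
  out[1] = (int16_t)(step[1] + step[2]);
  out[2] = (int16_t)(step[1] - step[2]);
  out[3] = (int16_t)(step[0] - step[3]);
}
```

The same pattern repeats upward: an 8 point idct would run this routine on its even-indexed inputs and add an odd portion, and so on up to 32 points.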

30 Jan, 2013 - 9 commits
- Merge "Adding a frame parallel decoding mode" into experimental · a53be609
  Deb Mukherjee authored 12 years ago
  
  a53be609
- Merge "don't code the branch for the predicted seg_id if that flag is false." into experimental · b499c24c
  Ronald S. Bultje authored 12 years ago
  
  b499c24c
- don't code the branch for the predicted seg_id if that flag is false. · 3a4b18bc
  Ronald S. Bultje authored 12 years ago
```
Change-Id: Icb6e21dc0c2d9918faa33c8bf70943660df7ad88
```
  3a4b18bc
- Merge "Default superblock skip flag to 32x32 for skip-blocks." into experimental · 4d53a95a
  Ronald S. Bultje authored 12 years ago
  
  4d53a95a
- Merge "Reset skip flag in superblock RD loop." into experimental · de6718a3
  Ronald S. Bultje authored 12 years ago
  
  de6718a3
- Merge "Further improvement on compound inter-intra expt" into experimental · d2875053
  Deb Mukherjee authored 12 years ago
  
  d2875053
- Default superblock skip flag to 32x32 for skip-blocks. · 3febf970
  Ronald S. Bultje authored 12 years ago
```
This is identical to the later decisions made in encode_superblock().
This commit doesn't actually change anything, but makes the mbmi state
more consistent between the RD loop and the final encode result.

Change-Id: I9e735afb7c5a52e5b61728cb88c67ef9b9bf59be
```
  3febf970
- Reset skip flag in superblock RD loop. · b90996c5
  Ronald S. Bultje authored 12 years ago
```
This is the superblock equivalent of commit 290b83ab.

Change-Id: Ib3945dd9e992fa9ec1fdea5a11e17a3cc0e37637
```
  b90996c5
- Write only visible area (for better comparison with rec.yuv). · 2f6fce3e
  Ronald S. Bultje authored 12 years ago
```
Change-Id: I32bf4ee532a15af78619cbcd8a193224029fab50
```
  2f6fce3e