1. 16 Jul, 2013 - 1 commit
    • Inline vp9_quantize() in xform_quant(). · 1ff94fea
      Ronald S. Bultje authored
      Cycle times:
      4x4:    151 to  131 cycles (15% faster)
      8x8:    334 to  306 cycles (9% faster)
      16x16: 1401 to 1368 cycles (2.5% faster)
      32x32: 7403 to 7367 cycles (0.5% faster)
      
      Total encode time of first 50 frames of bus @ 1500kbps (speed 0)
      goes from 1min39.2 to 1min38.6, i.e. a 0.67% overall speedup.
      
      Change-Id: I799a49460e5e3fcab01725564dd49c629bfe935f
  2. 11 Jul, 2013 - 1 commit
    • Moving segmentation related vars into separate struct. · c4ad3273
      Dmitry Kovalev authored
      Adding segmentation struct to vp9_seg_common.h. Struct members are from
      macroblockd and VP9Common structs. Moving segmentation related constants
      and enums to vp9_seg_common.h.
      
      Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
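      A minimal sketch of the kind of grouping described above, collecting the
      segmentation fields that previously lived in macroblockd and VP9Common
      into one struct. The field names here are illustrative assumptions, not
      the exact members of the real struct in vp9_seg_common.h:

      /* Illustrative grouping only; field names are assumptions. */
      #include <stdint.h>

      #define MAX_SEGMENTS_SKETCH 8
      #define SEG_LVL_MAX_SKETCH 4

      struct segmentation_sketch {
        uint8_t enabled;          /* segmentation on/off for the frame */
        uint8_t update_map;       /* segment map is coded in this frame */
        uint8_t update_data;      /* segment feature data is coded */
        uint8_t abs_delta;        /* feature data is absolute vs. a delta */
        uint8_t temporal_update;  /* segment map predicted from the last frame */

        int16_t feature_data[MAX_SEGMENTS_SKETCH][SEG_LVL_MAX_SKETCH];
        unsigned int feature_mask[MAX_SEGMENTS_SKETCH];
      };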
  3. 01 Jul, 2013 - 2 commits
    • Update quantize SSSE3 SIMD to cover 32x32 transform case also. · c8defcfd
      Ronald S. Bultje authored
      Encode time of bus (speed 0) 50 frames @ 1500kbps goes from 2min14.4 to
      2min10.1, i.e. a 2.3% overall speed increase.
      
      Change-Id: I3699580e74ec26c7d24e03681bc47ba25ee1ee87
    • Quantize (64-bit only, for now) SSSE3 SIMD. · 7353ceab
      Ronald S. Bultje authored
      Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps
      goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is
      x86-64 only, it needs some minor modifications to be 32bit compatible,
      because it uses 15 xmm registers, whereas 32bit only has 8.
      
      Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
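      A tiny illustrative fragment of the multiply-high idiom such SIMD
      quantizers are built on. It is deliberately plain SSE2 rather than the
      commit's SSSE3 code, handles only eight coefficients, and the
      (|coeff| + round) * quant >> 16 formula is a simplification of the real
      per-coefficient math:

      #include <emmintrin.h>  /* SSE2 */
      #include <stdint.h>

      /* Sketch: qcoeff = sign(coeff) * (((|coeff| + round) * quant) >> 16). */
      static void quantize8_sketch(const int16_t *coeff, const int16_t *round,
                                   const int16_t *quant, int16_t *qcoeff) {
        const __m128i c = _mm_loadu_si128((const __m128i *)coeff);
        const __m128i r = _mm_loadu_si128((const __m128i *)round);
        const __m128i q = _mm_loadu_si128((const __m128i *)quant);
        const __m128i zero = _mm_setzero_si128();
        const __m128i sign = _mm_cmpgt_epi16(zero, c);  /* all ones where coeff < 0 */
        const __m128i abs_c = _mm_sub_epi16(_mm_xor_si128(c, sign), sign);
        const __m128i tmp = _mm_mulhi_epi16(_mm_adds_epi16(abs_c, r), q);
        _mm_storeu_si128((__m128i *)qcoeff,
                         _mm_sub_epi16(_mm_xor_si128(tmp, sign), sign));
      }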
  4. 28 Jun, 2013 - 1 commit
    • Make coefficient skip condition an explicit RD choice. · af660715
      Ronald S. Bultje authored
      This commit replaces zrun_zbin_boost, a method of biasing non-zero
      coefficients following runs of zero-coefficients to be rounded towards
      zero, with an explicit skip-block choice in the RD loop.
      
      The logic is basically that if individual coefficients should be rounded
      towards zero (from an RD point of view), the trellis/optimize loop should
      take care of it. If whole blocks should be zero (from an RD point of
      view), a single RD check is much more efficient than a complete
      serialization of the quantization loop.
      
      Quality change: derf +0.5% psnr, +1.6% ssim; yt +0.6% psnr, +1.1% ssim.
      SIMD for quantize will follow in a separate patch. Results for other
      test sets pending.
      
      Change-Id: Ife5fa641163ac5150ac428011e87188f1937c1f4
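      A minimal sketch of the decision this commit introduces: compare the RD
      cost of coding the block's coefficients against the cost of signalling an
      all-zero (skip) block, and take the cheaper one. The RDCOST_SKETCH macro
      and the rate/distortion inputs below are placeholders, not the libvpx API:

      #include <stdint.h>

      /* Placeholder lambda-style cost: rate weighted by rdmult, distortion
       * scaled by a shift. */
      #define RDCOST_SKETCH(rdmult, dshift, rate, dist) \
        ((((int64_t)(rate) * (rdmult) + 128) >> 8) + ((int64_t)(dist) << (dshift)))

      static int should_skip_block(int rdmult, int dshift,
                                   int rate_coeffs, int64_t dist_coeffs,
                                   int rate_skip_flag, int64_t dist_all_zero) {
        const int64_t rd_coded =
            RDCOST_SKETCH(rdmult, dshift, rate_coeffs, dist_coeffs);
        const int64_t rd_skip =
            RDCOST_SKETCH(rdmult, dshift, rate_skip_flag, dist_all_zero);
        /* One block-level comparison replaces per-coefficient zbin boosting. */
        return rd_skip < rd_coded;
      }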
  5. 27 Jun, 2013 - 1 commit
  6. 19 Jun, 2013 - 1 commit
    • Add two-pass quantization · b5bf7b13
      Yunqing Wang authored
      Optimized the quantization function by making it a two-pass
      process. The first pass does a quick check of the transform
      coefficients against the base ZBIN and keeps only the good
      enough set of coefficients for quantization. A skip check is
      added: if all coefficients are within the base ZBIN, no
      quantization is needed. The second pass is the actual quantization
      pass, which only processes the coefficient subset determined
      in the first pass. This reduces the computation. Furthermore, an
      alternative method is used for large transform sizes, which often
      have sparse non-zero quantized coefficients.
      
      Overall, the encoder speedup is about 4%. The quantization function
      itself gets 20% faster.
      
      Change-Id: I3a9dd0da6db030260b6d9c314a9fa48ecae89f22
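      A simplified scalar sketch of the two-pass idea. The zbin/round/quant/
      dequant names mirror common libvpx parameter roles (two entries each:
      DC then AC), but this is not the actual vp9_quantize code, and the scan
      order is omitted for brevity:

      #include <stdint.h>
      #include <stdlib.h>
      #include <string.h>

      static void quantize_two_pass_sketch(const int16_t *coeff, int n_coeffs,
                                           const int16_t *zbin,
                                           const int16_t *round,
                                           const int16_t *quant,
                                           const int16_t *dequant,
                                           int16_t *qcoeff, int16_t *dqcoeff,
                                           int *eob) {
        int idx[32 * 32];  /* positions that survived the zbin check */
        int n_cand = 0, i;

        memset(qcoeff, 0, n_coeffs * sizeof(*qcoeff));
        memset(dqcoeff, 0, n_coeffs * sizeof(*dqcoeff));
        *eob = 0;

        /* Pass 1: cheap check against the base zbin. */
        for (i = 0; i < n_coeffs; ++i)
          if (abs(coeff[i]) >= zbin[i != 0]) idx[n_cand++] = i;
        if (n_cand == 0) return;  /* everything inside zbin: block is skipped */

        /* Pass 2: actual quantization, only over the surviving subset. */
        for (i = 0; i < n_cand; ++i) {
          const int rc = idx[i];
          const int sign = coeff[rc] < 0 ? -1 : 1;
          const int tmp =
              ((abs(coeff[rc]) + round[rc != 0]) * quant[rc != 0]) >> 16;
          qcoeff[rc] = (int16_t)(sign * tmp);
          dqcoeff[rc] = (int16_t)(qcoeff[rc] * dequant[rc != 0]);
          if (tmp) *eob = rc + 1;
        }
      }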
  7. 23 May, 2013 - 1 commit
  8. 17 May, 2013 - 1 commit
    • Initial version of alpha channel support · 679e4abd
      John Koleszar authored
      This is a mostly-working implementation of an extra channel in the
      bitstream. Configure with --enable-alpha to test. Notable TODOs:
      
       - Add extra channel to all mismatch tests, PSNR, SSIM, etc
       - Configurable subsampling
       - Variable number of planes (currently always uses all 4)
       - Loop filtering
       - Per-plane lossless quantizer
       - ARNR support
      
      This implementation just uses the same contents as the Y channel
      for the A channel, due to a lack of content and the general pain of
      playing back 4-channel content. A later patch will use the actual
      alpha channel passed in from outside the codec.
      
      Change-Id: Ibf81f023b1c570bd84b3064e9b4b8ae52e087592
  9. 07 May, 2013 - 3 commits
  10. 03 May, 2013 - 1 commit
    • Separate transform and quant from vp9_encode_sb · 4529c68b
      John Koleszar authored
      This allows removing a large number of transform size specific functions,
      as well as supporting 444/alpha by routing all code through the
      subsampling-aware path.
      
      Change-Id: Ieb085cebe9f37f24fc24de179898b22abfda08a4
  11. 02 May, 2013 - 1 commit
    • Create common vp9_encode_sb{,y} · 3f4e8063
      John Koleszar authored
      Creates a common encode (subtract, transform, quantize, optimize,
      inverse transform, reconstruct) function for all sb sizes, including
      the old 16x16 path.
      
      Change-Id: I964dff1ea7a0a5c378046a069ad83495f54df007
  12. 01 May, 2013 - 1 commit
  13. 30 Apr, 2013 - 2 commits
    • sb8x8 integration in rd loop. · d068d869
      Ronald S. Bultje authored
      Work-in-progress, not yet ready for review. TODO items:
      - bitstream writing (encoder) and reading (decoder)
      - decoder reconstruction
      
      Change-Id: I5afb7284e7e0480847b47cd0097cb469433c9081
    • Adding vp9_get_qindex function. · 3f6c6ffc
      Dmitry Kovalev authored
      Moving common code from encoder and decoder to vp9_get_qindex function.
      Also moving quant-related constants from vp9_onyxc_int.h to
      vp9_quant_common.h.
      
      Change-Id: I70c5bfbaa1c8bf00fde0bfc459d077f88b6d46c8
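      A self-contained sketch of what such a shared helper can look like: if
      segmentation is enabled and the segment has an alternate-Q feature
      active, use (or add) the segment's Q value, otherwise fall back to
      base_qindex. The struct and names here are illustrative, not the real
      ones:

      #define MAXQ_SKETCH 255

      struct seg_sketch {
        int enabled;
        int abs_delta;        /* nonzero: data is absolute, not a delta */
        int alt_q_active[8];  /* per-segment ALT_Q feature flag */
        int alt_q_data[8];    /* per-segment ALT_Q value or delta */
      };

      static int clamp_q(int q) {
        return q < 0 ? 0 : (q > MAXQ_SKETCH ? MAXQ_SKETCH : q);
      }

      static int get_qindex_sketch(const struct seg_sketch *seg, int segment_id,
                                   int base_qindex) {
        if (seg->enabled && seg->alt_q_active[segment_id]) {
          const int data = seg->alt_q_data[segment_id];
          return clamp_q(seg->abs_delta ? data : base_qindex + data);
        }
        return base_qindex;
      }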
  14. 26 Apr, 2013 - 1 commit
  15. 25 Apr, 2013 - 3 commits
  16. 24 Apr, 2013 - 2 commits
  17. 23 Apr, 2013 - 1 commit
    • Convert coeff to per-plane MACROBLOCK data · 138ec38c
      John Koleszar authored
      This commit moves the coeff storage from the MACROBLOCK struct to its
      per-plane part. The next commit will remove the coeff member from the
      BLOCK structure so that it is consistently accessed per-plane.
      
      Also refactors vp9_sb_block_error_c and vp9_sb_uv_block_error_c to be
      variable subsampling aware.
      
      Change-Id: I18c30f87f27c3a012119b6c1970d5fa499804455
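      An illustrative per-plane layout in the spirit of this migration (field
      names are assumptions, not the actual MACROBLOCK members): each plane
      carries its own coefficient buffers instead of one block-wide array.

      #include <stdint.h>

      #define MAX_PLANES_SKETCH 4  /* Y, U, V and an optional alpha plane */

      struct mb_plane_sketch {
        int16_t *coeff;    /* transform coefficients for this plane */
        int16_t *qcoeff;   /* quantized coefficients */
        uint16_t *eobs;    /* end-of-block markers, one per transform block */
      };

      struct mb_sketch {
        struct mb_plane_sketch plane[MAX_PLANES_SKETCH];
      };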
  18. 22 Apr, 2013 - 2 commits
  19. 16 Apr, 2013 - 1 commit
  20. 15 Apr, 2013 - 1 commit
  21. 11 Apr, 2013 - 1 commit
    • Remove unused macroblock versions of reconstruction functions. · 13e41ba4
      Ronald S. Bultje authored
      More specifically, remove vp9_quantize_mb*, vp9_optimize_mb*,
      vp9_inverse_transform_mb* and vp9_transform_mb*. Instead, use the
      generic _sb* functions that take a size argument, and call them with
      BLOCK_SIZE_MB16X16.
      
      Change-Id: I33024afea95d3a23ffbc1df7da426e4645110f29
  22. 10 Apr, 2013 - 1 commit
    • Make SB coding size-independent. · a3874850
      Ronald S. Bultje authored
      Merge sb32x32 and sb64x64 functions; allow for rectangular sizes. Code
      gives identical encoder results before and after. There are a few
      macros for rectangular block sizes under the sbsegment experiment; this
      experiment is not yet functional and should not yet be used.
      
      Change-Id: I71f93b5d2a1596e99a6f01f29c3f0a456694d728
  23. 09 Apr, 2013 - 1 commit
    • Fixing upper case names. · c34f6fcb
      Dmitry Kovalev authored
      Renaming Y1dequant to y_dequant, UVdequant to uv_dequant, QIndex to qindex.
      
      Change-Id: I1c356e5f886deb3f8807dc212de9799b55b09d58
  24. 05 Apr, 2013 - 1 commit
    • Move EOB to per-plane data · 05a79f2f
      John Koleszar authored
      Continue migrating data from BLOCKD/MACROBLOCKD to the per-plane
      structures.
      
      Change-Id: Ibbfa68d6da438d32dcbe8df68245ee28b0a2fa2c
  25. 04 Apr, 2013 - 1 commit
  26. 26 Mar, 2013 - 2 commits
    • Add col/row-based coefficient scanning patterns for 1D 8x8/16x16 ADSTs. · d9094d8f
      Ronald S. Bultje authored
      These are mostly just for experimental purposes. I saw small gains (in
      the 0.1% range) when playing with this on derf.
      
      Change-Id: Ib21eed477bbb46bddcd73b21c5c708a5b46abedc
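      A hypothetical sketch of how row- and column-major scans for an n x n
      block can be generated. The scans added by the commit are hand-written
      constant tables, and which scan pairs with which 1D transform
      orientation is left to the real code:

      static void build_row_scan_sketch(int *scan, int n) {
        int r, c, k = 0;
        for (r = 0; r < n; ++r)  /* sweep row by row */
          for (c = 0; c < n; ++c)
            scan[k++] = r * n + c;
      }

      static void build_col_scan_sketch(int *scan, int n) {
        int r, c, k = 0;
        for (c = 0; c < n; ++c)  /* sweep column by column */
          for (r = 0; r < n; ++r)
            scan[k++] = r * n + c;
      }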
    • Modeling default coef probs with distribution · fd18d5df
      Deb Mukherjee authored
      Replaces the default tables for single coefficient magnitudes with
      those obtained from an appropriate distribution. The EOB node
      is left unchanged. The model is represented as a 256-entry codebook
      where the index corresponds to the probability of the Zero or the
      One node. Two variations are implemented, corresponding to whether
      the Zero node or the One node is used as the peg. The main advantage
      is that the default prob tables become considerably smaller and more
      manageable. Besides, there is substantially less risk of over-fitting
      to a training set.
      
      Various distributions are tried and the one that gives the best
      results is the family of Generalized Gaussian distributions with
      shape parameter 0.75. The results are within about 0.2% of fully
      trained tables for the Zero peg variant, and within 0.1% for the
      One peg variant.
      
      The forward updates are optionally (controlled by a macro)
      model-based, i.e. restricted to only convey probabilities from the
      codebook. Backward updates can also optionally (controlled by
      another macro) be model-based, but this is turned off by default.
      Currently, model-based forward updates work about the same as
      unconstrained updates, but there is a drop in performance when
      backward updates are model-based.
      
      The model based approach also allows the probabilities for the key
      frames to be adjusted from the defaults based on the base_qindex of
      the frame. Currently the adjustment function is a placeholder that
      adjusts the prob of EOB and Zero node from the nominal one at higher
      quality (lower qindex) or lower quality (higher qindex) ends of the
      range. The rest of the probabilities are then derived based on the
      model from the adjusted prob of zero.
      
      Change-Id: Iae050f3cbcc6d8b3f204e8dc395ae47b3b2192c9
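      A numerical sketch of the modeling idea: assume coefficient magnitudes
      follow a generalized Gaussian with shape 0.75 and derive the Zero/One
      node probabilities from its mass in the corresponding magnitude bins.
      The scale value and bin edges below are arbitrary illustrations, not the
      codebook shipped in this commit:

      #include <math.h>
      #include <stdio.h>

      /* Unnormalized generalized Gaussian density, shape beta, scale s. */
      static double gg_density(double x, double beta, double s) {
        return exp(-pow(fabs(x) / s, beta));
      }

      /* Mass of the magnitude bin [lo, hi) via midpoint integration. */
      static double bin_mass(double lo, double hi, double beta, double s) {
        const int steps = 1000;
        const double dx = (hi - lo) / steps;
        double sum = 0.0;
        int i;
        for (i = 0; i < steps; ++i)
          sum += gg_density(lo + (i + 0.5) * dx, beta, s) * dx;
        return sum;
      }

      int main(void) {
        const double beta = 0.75;  /* shape parameter quoted above */
        const double s = 4.0;      /* arbitrary illustrative scale */
        const double total = bin_mass(0.0, 256.0, beta, s);
        const double p_zero = bin_mass(0.0, 0.5, beta, s) / total;
        const double p_one = bin_mass(0.5, 1.5, beta, s) / total;
        printf("P(zero) = %f, P(one | nonzero) = %f\n",
               p_zero, p_one / (1.0 - p_zero));
        return 0;
      }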
  27. 07 Mar, 2013 - 3 commits
    • Consistent usage of ROUND_POWER_OF_TWO macro. · 3603dfb6
      Dmitry Kovalev authored
      Change-Id: I44660975e9985310d8c654c158ee7a61291b5a08
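      For reference, the macro implements a rounding (rather than truncating)
      right shift; the definition below is the commonly used form and is shown
      here only as an illustration:

      /* Example: ROUND_POWER_OF_TWO(14, 2) == 4, whereas 14 >> 2 == 3. */
      #define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))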
    • Re-add support for ADST in superblocks. · d3724abe
      Ronald S. Bultje authored
      This also changes the RD search to take account of the correct block
      index when searching (this is required for ADST positioning to work
      correctly in combination with tx_select).
      
      Change-Id: Ie50d05b3a024a64ecd0b376887aa38ac5f7b6af6
    • Coding non-zero count rather than EOB for coeffs · eb6ef241
      Deb Mukherjee authored
      This patch revamps the entropy coding of coefficients to code first
      a non-zero count per coded block and correspondingly remove the EOB
      token from the token set.
      
      STATUS:
      Main encode/decode code achieving encode/decode sync - done.
      Forward and backward probability updates to the nzcs - done.
      RD costing updates for nzcs - done.
      Note: The dynamic programming approach used in trellis quantization
      is not exactly compatible with nzcs. A suboptimal approach has been
      used instead, where branch costs are updated to account for changes
      in the nzcs.
      
      TODO:
      Training the default probs/counts for nzcs
      
      Change-Id: I951bc1e22f47885077a7453a09b0493daa77883d
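      A toy illustration of the two ways of signalling the same quantized
      block: the EOB (one past the last non-zero coefficient in scan order)
      versus the per-block non-zero count (nzc) this patch codes instead.
      Helper names are illustrative and assume qcoeff is already in scan order:

      #include <stdint.h>

      static int block_eob_sketch(const int16_t *qcoeff, int n) {
        int i, eob = 0;
        for (i = 0; i < n; ++i)
          if (qcoeff[i] != 0) eob = i + 1;
        return eob;  /* what the old EOB token conveyed */
      }

      static int block_nzc_sketch(const int16_t *qcoeff, int n) {
        int i, nzc = 0;
        for (i = 0; i < n; ++i) nzc += (qcoeff[i] != 0);
        return nzc;  /* what gets coded per block under the nzc scheme */
      }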
  28. 05 Mar, 2013 - 1 commit
    • Make superblocks independent of macroblock code and data. · 111ca421
      Ronald S. Bultje authored
      Split macroblock and superblock tokenization and detokenization
      functions and coefficient-related data structs so that the bitstream
      layout and related code of superblock coefficients looks less like it's
      a hack to fit macroblocks in superblocks.
      
      In addition, unify chroma transform size selection with the luma transform
      size (i.e. always use the same size, as long as it fits the predictor);
      in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma
      transform will now use the 16x16 (instead of the 8x8) chroma transform,
      and 64x64 superblocks using the 32x32 luma transform will now use the
      32x32 (instead of the 16x16) chroma transform.
      
      Lastly, add a trellis optimize function for 32x32 transform blocks.
      
      HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There are
      a few negative points here and there that I might want to analyze
      a little more closely.
      
      Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430
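      A minimal sketch of the unified chroma transform size rule described
      above: chroma uses the luma transform size, capped so it still fits the
      subsampled chroma prediction block. The encoding of tx sizes and the
      helper name are assumptions:

      /* tx size index: 0 = 4x4, 1 = 8x8, 2 = 16x16, 3 = 32x32 (width = 4 << tx). */
      static int get_uv_tx_size_sketch(int luma_tx_size, int uv_width_log2) {
        const int max_uv_tx = uv_width_log2 - 2;  /* largest tx that fits */
        return luma_tx_size < max_uv_tx ? luma_tx_size : max_uv_tx;
      }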
  29. 27 Feb, 2013 - 1 commit