Commits · 70fe2b3ec34da0d03acfb3748a3132a9c8b4b5a0 · BC / public / external / libvpx

16 Jul, 2013 - 21 commits
- Merge "Cosmetic changes in 4x4 and 8x8 fdct unit tests" · 70fe2b3e
  James Zern authored 11 years ago
  
  70fe2b3e
- Merge "VP[89]_COMMON: remove unused near_boffset" · c0562d08
  James Zern authored 11 years ago
  
  c0562d08
- Merge "VP9_COMMON: remove unused framerate/bitrate" · 63e914bd
  James Zern authored 11 years ago
  
  63e914bd
- Merge "yv12config: remove YUV_TYPE" · 3a7c2665
  James Zern authored 11 years ago
  
  3a7c2665
- Merge "Replace generated quant tables with static lookup tables." · 58a20053
  Ronald S. Bultje authored 11 years ago
  
  58a20053
- Replace generated quant tables with static lookup tables. · e965cccc
  Ronald S. Bultje authored 11 years ago
```
This prevents possible float rounding issues between architectures.

Change-Id: I6ed260aebd49feb4cfb5596a5370c44be5f72167
```
  e965cccc
- Merge "Fix above context pointers" · cc1aac1b
  John Koleszar authored 11 years ago
  
  cc1aac1b
- Merge "SSE2 8x8 inverse ADST/DCT transform" · 58519047
  Jingning Han authored 11 years ago
  
  58519047
- Fix above context pointers · 5efd9609
  John Koleszar authored 11 years ago
```
In the prior code, the above context pointers used for entropy
decoding were initialized on the first frame, and not updated when
the frame size changed. The per-frame code which initializes the
contexts assumes that the contexts are contiguous, leading to an
incomplete initialization when the frame is smaller. This commit
updates the pointers so that the context is contiguous whenever
the frame size changes.

Change-Id: I08b53e3a30c8289491212311682ff1b8028cff6c
```
  5efd9609
- Merge "vp9_convolve8_[horiz|vert]_avg" · 90ebfe62
  Johann authored 11 years ago
  
  90ebfe62
- Merge "Skip inter-coded block reconstruction in rd loop" · dd97c62a
  Jingning Han authored 11 years ago
  
  dd97c62a
- Merge "Removing and moving around constant definitions." · e8e7620a
  Dmitry Kovalev authored 11 years ago
  
  e8e7620a
- Merge "Change to extend full border only when needed" · c5b0cd84
  Yaowu Xu authored 11 years ago
  
  c5b0cd84
- Change to extend full border only when needed · 5b915ebd
  Yaowu Xu authored 11 years ago
```
This is a short-term optimization until we work out a decoder
implementation requiring no frame border extension.

Change-Id: I02d15bfde4d926b50a4e58b393d8c4062d1be70f
```
  5b915ebd
- Removing and moving around constant definitions. · ca75f125
  Dmitry Kovalev authored 11 years ago
```
Removing unused and duplicated constants, moving them from *.h to *.c
if possible.

Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f
```
  ca75f125
- Merge "Consistent naming for loop-filter filters." · 65762849
  Dmitry Kovalev authored 11 years ago
  
  65762849
- Merge "Remove print_nmvcounts" · 6eae37f4
  Johann authored 11 years ago
  
  6eae37f4
- Increase border size from 96 to 160. · b02c4d36
  Ronald S. Bultje authored 11 years ago
```
This is required because upon downscaling, if a motion vector points
partially into the UMV (e.g. all but 1 of 64+7 pixels, i.e. 70),
then we can point up to 140 pixels into the larger-resolution (2x)
reference buffer UMV, which means the UMV for reference buffers in
downscaling needs to be 140 rounded up to the nearest multiple of 32,
i.e. 160.

Longer-term, we should probably handle the UMV differently by detecting
edge coverage on-the-fly and using a temporary buffer for edge extensions
instead of adding 160 pixels on all sides of the image (which means a
CIF image uses 3x its own area size for borders).

Change-Id: I5184443e6731cd6721fc6a5d430a53e7d91b4f7e
```
  b02c4d36
- Inline vp9_quantize() in xform_quant(). · 1ff94fea
  Ronald S. Bultje authored 11 years ago
```
Cycle times:
4x4:    151 to  131 cycles (15% faster)
8x8:    334 to  306 cycles (9% faster)
16x16: 1401 to 1368 cycles (2.5% faster)
32x32: 7403 to 7367 cycles (0.5% faster)

Total encode time of first 50 frames of bus @ 1500kbps (speed 0)
goes from 1min39.2 to 1min38.6, i.e. a 0.67% overall speedup.

Change-Id: I799a49460e5e3fcab01725564dd49c629bfe935f
```
  1ff94fea
- Merge "Inline xform_quant() in encode_block_intra()." · 7e684e20
  Ronald S. Bultje authored 11 years ago
  
  7e684e20
- Merge "Neon: Update mbfilter if all vectors follow one branch." · ce1d69ae
  Frank Galligan authored 11 years ago
  
  ce1d69ae
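
The border arithmetic in "Increase border size from 96 to 160" above can be checked with a small sketch. The helper names are hypothetical and not taken from libvpx; the inputs (64-pixel block, 7 extra filter pixels, 2x scaling, 32-pixel alignment) come from the commit message, and 352x288 is assumed as the CIF resolution.

```c
#include <assert.h>

/* Round x up to the next multiple of m. */
static int round_up(int x, int m) {
  assert(m > 0);
  return ((x + m - 1) / m) * m;
}

/* Border needed in a reference buffer that is `scale` times larger,
 * following the reasoning in the commit message. */
static int scaled_border(int block, int filter_ext, int scale, int align) {
  int reach = block + filter_ext - 1;     /* 64 + 7 - 1 = 70 */
  return round_up(reach * scale, align);  /* 140 rounded up to 160 */
}

/* Area of the border ring divided by the area of the image itself. */
static double border_to_image_ratio(int w, int h, int border) {
  double image = (double)w * h;
  double padded = (double)(w + 2 * border) * (h + 2 * border);
  assert(image > 0.0);
  return (padded - image) / image;
}
```

With these inputs, `scaled_border(64, 7, 2, 32)` gives 160, and `border_to_image_ratio(352, 288, 160)` is slightly above 3, which matches the "3x its own area" remark.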
15 Jul, 2013 - 8 commits

Consistent naming for loop-filter filters. · e973b4e2

Dmitry Kovalev authored 11 years ago

Renaming flatmask4 to flat_mask4, flatmask5 to flat_mask5, hevmask to
hev_mask, filter to filter4, mbfilter to filter8, wide_mbfilter to
filter16.

Change-Id: Ic61c73e59c2eee505257584867aafac99833cea1

e973b4e2

Inline xform_quant() in encode_block_intra(). · 6fb41874

Ronald S. Bultje authored 11 years ago

Also inline some of the block calculations to assist the compiler to
not do silly things like calculating the same offset (or converting
between raster/transform block offset or block, mi and pixel unit)
many, many, many times.

Cycle times:
4x4:     584 ->   505 cycles (16% faster)
8x8:    1651 ->  1560 cycles (6% faster)
16x16:  7897 ->  7704 cycles (2.5% faster)
32x32: 16096 -> 15852 cycles (1.5% faster)

Overall, this saves about 0.5 seconds (1min49.8 -> 1min49.3) on the
first 50 frames of bus (speed 0) @ 1500kbps, i.e. 0.5% overall.

Change-Id: If3dd62453f8e2ab9d4ee616bc4ea956fb8874b80

6fb41874

Code cleanup inside vp9_decodeframe.c. · 2c317298

Dmitry Kovalev authored 11 years ago

Removing unused DEC_DEBUG define and dec_debug variable. Changing function
signatures to eliminate code duplication, renaming function
mb_init_dequantizer to init_dequantizer. Also removing redundant curly
braces and comments.

Change-Id: Ia56ee1b0be5f24abb0e878581845be8a4773c298

2c317298

Neon: Update mbfilter if all vectors follow one branch. · f4f60f60

Frank Galligan authored 11 years ago

Change the mbfilter Neon code to stop executing both branches when all
vectors follow only one branch.

The code is about 5% faster when executing only one branch and about
1% slower when executing both branches.

-PS5: Remove local stack space from mbfilter.

Change-Id: I6a23f9b318a9f4568a2718b4c9348db988fe2182

f4f60f60

Cosmetic changes in 4x4 and 8x8 fdct unit tests · 6094bf37
Jingning Han authored 11 years ago
```
Make the code consistent with conventions.

Change-Id: Id044ed8382f83a3c3f54f9edd569f00bcd0523db
```
6094bf37

Skip inter-coded block reconstruction in rd loop · 043e0f9d

Jingning Han authored 11 years ago

Skip the inverse transform and reconstruction of inter-mode coded
blocks in the rate-distortion optimization loop, when skip_encode_sb
feature is turned on. This provides about 1% speed-up at speed 0,
and 1.5% speed-up at speed 1. No performance change in either setting.

Change-Id: I2932718bf4d007163702b61b16b6ff100cf9d007

043e0f9d

Skip duplicate block encoding in the rd loop · faff6ed0

Jingning Han authored 11 years ago

This speed feature allows the encoder to largely remove the spatial
dependency between blocks inside a 64x64 superblock, thereby removing
the need to repeatedly encode superblocks per partition type in the
rate-distortion optimization loop.

A major challenge lies in the intra modes tested in the rate-distortion
optimization loop. The subsequent blocks do not have access to the
reconstructed boundary pixels without the intermediate coding steps.
This was resolved by using the original pixels for intra prediction
in the rd loop, followed by an appropriately designed distortion
modeling on the quantization parameters. Experiments also suggested
that the performance impact is more discernible at lower bit-rate/psnr
settings. Hence a quantizer dependent threshold is applied to deactivate
skip of block coding.

For bus_cif at 2000 kbps,
speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB
         performance loss.

speed 1: runtime 65312ms  -> 61536ms, (7...

faff6ed0
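
A minimal sketch of the quantizer-dependent gate described in the commit above. The commit message does not give the threshold or its direction; the value 128 and the function name are hypothetical, and the sketch assumes that lower bit-rates correspond to higher quantizer indices (0..255), so the skip is turned off once the index reaches the cut-off.

```c
#include <assert.h>

/* Hypothetical cut-off; the real threshold is not stated in the commit. */
enum { SKIP_ENCODE_MAX_QINDEX = 128 };

/* Returns nonzero when block encoding may be skipped in the rd loop:
 * the speed feature must be on and the quantizer index must be below
 * the (assumed) cut-off. */
static int use_skip_encode(int feature_enabled, int q_index) {
  assert(q_index >= 0 && q_index <= 255);
  return feature_enabled && q_index < SKIP_ENCODE_MAX_QINDEX;
}
```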

Merge "Fixing vp9_get_pred_context_comp_ref_p function." · 1f14bbb6
Dmitry Kovalev authored 11 years ago

1f14bbb6

13 Jul, 2013 - 6 commits

VP9_COMMON: remove unused framerate/bitrate · 04092764

James Zern authored 11 years ago

+ VP8_COMMON: place them under CONFIG_POSTPROC_VISUALIZER

Change-Id: I2702d5a3e1134b9c5f7ddc14b4173955a400f2cf

04092764

SSE2 8x8 inverse ADST/DCT transform · 91365add

Jingning Han authored 11 years ago

This commit enables SSE2 implementation of 8x8 inverse ADST/DCT
transform. The runtime goes from 1216 cycles -> 266 cycles.
For bus_cif at 2000 kbps, the overall runtime reduces from
253707ms -> 248430ms, i.e., 2% speed-up at speed 0.

Change-Id: Ib0372e17e9162d7b11a10d653b1c8be547c878fb

91365add
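
The figures quoted in the commit above can be reproduced with two small helpers. The function names are hypothetical; only the cycle counts and runtimes come from the commit message.

```c
#include <assert.h>

/* Fraction of the original time that was saved: (before - after) / before. */
static double time_saved(double before, double after) {
  assert(before > 0.0);
  return (before - after) / before;
}

/* How many times faster the new code runs: before / after. */
static double speedup_factor(double before, double after) {
  assert(after > 0.0);
  return before / after;
}
```

Here `speedup_factor(1216, 266)` is roughly 4.6 for the transform itself, and `time_saved(253707, 248430)` is roughly 0.021, in line with the quoted 2% overall speed-up.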

VP[89]_COMMON: remove unused near_boffset · ce0324d8
James Zern authored 11 years ago
```
Change-Id: If9b9ca703b997312df85241a0758d414cfdc5228
```
ce0324d8

Using vp9_copy and vp9_zero instead of custom code. · 42907098
Dmitry Kovalev authored 11 years ago
```
Change-Id: Id9b6ceeddca3f9b34bfada5c499b1e7a2f42c30b
```
42907098
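
The commit above does not show how `vp9_copy` and `vp9_zero` are defined. Helpers of this kind are commonly written as `sizeof`-based macros; the sketch below uses the hypothetical names `sketch_copy` and `sketch_zero` to show that pattern and should not be read as the libvpx definitions.

```c
#include <assert.h>
#include <string.h>

/* Copy a whole array or struct; both operands must have the same size. */
#define sketch_copy(dest, src)                 \
  do {                                         \
    assert(sizeof(dest) == sizeof(src));       \
    memcpy(&(dest), &(src), sizeof(src));      \
  } while (0)

/* Zero a whole array or struct in place. */
#define sketch_zero(dest) memset(&(dest), 0, sizeof(dest))
```

Macros like these only work on true arrays and structs. Given a pointer, `sizeof` would report the size of the pointer and not of the data it points to.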

Fixing vp9_get_pred_context_comp_ref_p function. · 31a68bcd

Dmitry Kovalev authored 11 years ago

Adding missing parentheses around boolean expressions. Bitstream is changed.
Regenerating test vectors.

Change-Id: I4cc00b761e9473f92f180a9fc3a0c607f0aaae56

31a68bcd
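
The commit above does not quote the expression that was fixed. As a generic illustration of how missing parentheses change a C boolean expression, the hypothetical functions below compare `flags & mask == 0`, written out the way the compiler parses it, with the parenthesized form that was probably intended in such a case.

```c
#include <assert.h>

/* What "flags & mask == 0" means to the compiler: == binds tighter than &,
 * so the comparison runs first and its 0/1 result is ANDed with flags. */
static int as_parsed(int flags, int mask) {
  return flags & (mask == 0);
}

/* What the author of such an expression usually intends. */
static int as_intended(int flags, int mask) {
  return (flags & mask) == 0;
}
```

For `flags = 2, mask = 1` the two forms disagree: `as_parsed` returns 0 while `as_intended` returns 1. A difference of this kind in a context function would alter the coded bitstream, which fits the commit's note that the test vectors had to be regenerated.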

Merge "Removing redundant call to set_mi_row_col." · 31403080
Dmitry Kovalev authored 11 years ago

31403080

12 Jul, 2013 - 5 commits
- Removing redundant call to set_mi_row_col. · 3c94fffd
  Dmitry Kovalev authored 11 years ago
```
This function is actually called from set_offsets which is called right
before vp9_read_mode_info.

Change-Id: Ibb9d5ad606194bc80eab264fad85b31c9dfd8f77
```
  3c94fffd
- vp9_convolve8_[horiz|vert]_avg · a15bebfc
  Johann authored 11 years ago
```
Super basic conversion from the other implementations. Any changes to
one should be trivial to copy over to keep in sync.

Change-Id: I1720b4128e0aba4b2779e3761f6494f8a09d3ea8
```
  a15bebfc
- Merge "Fix a build issue" · cdea4a7c
  Yaowu Xu authored 11 years ago
  
  cdea4a7c
- Merge "Adding struct tx_probs and struct tx_counts to cleanup the code." · aa518af8
  Dmitry Kovalev authored 11 years ago
  
  aa518af8
- Merge "Making functions read_{inter, intra}_segment_id more similar." · 444c8d4c
  Dmitry Kovalev authored 11 years ago
  
  444c8d4c