Commits · d4158283e79fbdf684ec4f2f983509b793c59090 · BC / public / external / libvpx

02 Jul, 2013 - 1 commit

use partitioning from last frame · d4158283

Jim Bankoski authored 11 years ago


This CL changes the use-partition-from-last-frame logic to do the following:

if part is none, horz, or vert -> try split
if part != none and one of the children is not split -> try none


Change-Id: I5b6c659e35f3ac9f11c051b92ba98af6d7e8aa87
Signed-off-by: Jim Bankoski <jimbankoski@google.com>

d4158283
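
The two rules in this commit message can be sketched in C as follows. This is a minimal illustration, not the libvpx implementation: `PartType`, `ExtraSearch`, and `extra_partition_search()` are hypothetical names, and the sketch assumes the block reused from the last frame is described by its own partition type plus the partition types of its four sub-blocks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical partition types (illustrative names, not libvpx's). */
typedef enum { PART_NONE, PART_HORZ, PART_VERT, PART_SPLIT } PartType;

/* Extra partitionings to evaluate in addition to the one reused
 * from the last frame. */
typedef struct {
  bool try_split;
  bool try_none;
} ExtraSearch;

/* `last` is the partition chosen for this block in the last frame;
 * `children` holds the partition types of its four sub-blocks
 * (an assumption about how "children" is represented). */
static ExtraSearch extra_partition_search(PartType last,
                                          const PartType children[4]) {
  ExtraSearch s = { false, false };
  assert(last >= PART_NONE && last <= PART_SPLIT);

  /* Rule 1: none, horz, or vert -> also try split. */
  if (last == PART_NONE || last == PART_HORZ || last == PART_VERT)
    s.try_split = true;

  /* Rule 2: not none, and at least one child is not split -> also try none. */
  if (last != PART_NONE) {
    for (int i = 0; i < 4; ++i) {
      if (children[i] != PART_SPLIT) {
        s.try_none = true;
        break;
      }
    }
  }
  return s;
}
```

Under these assumptions, a block whose four sub-blocks are all split further triggers neither extra check, which matches the intent of skipping the "none" test when the content clearly needs fine partitioning.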

01 Jul, 2013 - 12 commits
- Merge "Quantize (64-bit only, for now) SSSE3 SIMD." · ba3b2604
  Yaowu Xu authored 11 years ago
  
  ba3b2604
- Merge "Removing vp9_modecont.{h, c}." · 6411228a
  Dmitry Kovalev authored 11 years ago
  
  6411228a
- Merge "Moving encoder subexp encoding functions to subexp.{h, c}." · d9db0d96
  Dmitry Kovalev authored 11 years ago
  
  d9db0d96
- Merge "Adding vp9_rb_read_signed_literal function." · a4e14d7f
  Dmitry Kovalev authored 11 years ago
  
  a4e14d7f
- Merge "Inlining decode_atom, decode_sb_intra, and decode_sb." · 33fffc15
  Dmitry Kovalev authored 11 years ago
  
  33fffc15
- Merge "Cleanup inside vp9_decodemv.c." · 2381d464
  Dmitry Kovalev authored 11 years ago
  
  2381d464
- Quantize (64-bit only, for now) SSSE3 SIMD. · 7353ceab
  Ronald S. Bultje authored 11 years ago
```
Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps
goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code
is x86-64 only; it needs some minor modifications to be 32-bit
compatible, because it uses 15 xmm registers, whereas 32-bit only has 8.

Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
```
  7353ceab
- Removing vp9_modecont.{h, c}. · 2ab3bc88
  Dmitry Kovalev authored 11 years ago
```
Moving vp9_default_inter_mode_probs array to vp9_entropymode.c.

Change-Id: I88ebda86ccc07f2a43c6c01d4b37898214cfb6de
```
  2ab3bc88
- Merge "New motion threshold factor - speed feature." · 7bb436fe
  Paul Wilkins authored 11 years ago
  
  7bb436fe
- fix a mismatch in cpuused 2 · 632289b3
  Yaowu Xu authored 11 years ago
```
Change-Id: I921c9faba6386535aaf717a54301dd346a9b8540
```
  632289b3
- New motion threshold factor - speed feature. · 13772781
  Paul Wilkins authored 11 years ago
```
Added a speed feature that focuses only on thresholds
for new motion modes.

Moved sf->comp_inter_joint_search_thresh into speed
1.  This has a ~+0.4% impact on quality at speed 0,
our quality reference baseline.

Slight adjustment to baseline thresholds.

Change-Id: I7ebf104f1fe29af77ed4837b2e84be065621bbe5
```
  13772781
- Adding vp9_rb_read_signed_literal function. · e5e15eb3
  Dmitry Kovalev authored 11 years ago
```
Change-Id: I30ea91561ffac7e5065ba41b2d3ab7dedb720593
```
  e5e15eb3
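
For readers unfamiliar with the signed-literal helper named in the last entry above, the sketch below shows one common sign-magnitude layout: an n-bit magnitude followed by a single sign bit. The `BitReader` type and the `rb_*` functions are hypothetical stand-ins, and the bit layout is an assumption; the commit message does not describe how `vp9_rb_read_signed_literal` is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader over a byte buffer. */
typedef struct {
  const uint8_t *buf;
  size_t size;       /* buffer size in bytes */
  size_t bit_offset; /* next bit to read, counted from the start */
} BitReader;

/* Read one bit, most significant bit of each byte first. */
static int rb_read_bit(BitReader *rb) {
  const size_t byte = rb->bit_offset >> 3;
  const int shift = 7 - (int)(rb->bit_offset & 7);
  assert(byte < rb->size);
  rb->bit_offset++;
  return (rb->buf[byte] >> shift) & 1;
}

/* Read an unsigned value of `bits` bits. */
static int rb_read_literal(BitReader *rb, int bits) {
  int value = 0;
  for (int i = 0; i < bits; ++i) value = (value << 1) | rb_read_bit(rb);
  return value;
}

/* Assumed layout: magnitude first, then one sign bit (1 = negative). */
static int rb_read_signed_literal(BitReader *rb, int bits) {
  const int value = rb_read_literal(rb, bits);
  return rb_read_bit(rb) ? -value : value;
}
```

With this layout, the bit string `1011 1` read with `bits = 4` gives a magnitude of 11 and a set sign bit, so the call returns -11.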
29 Jun, 2013 - 14 commits
- Merge "Enable SSE2 4x4 ADST/DCT transform" · 993942ce
  Jingning Han authored 11 years ago
  
  993942ce
- SSE2 version of vp9_short_fdct32x32_rd. · 466e0cf3
  Christian Duvivier authored 11 years ago
```
43,000 -> 5,750 cycles, about 7.5x faster.

Change-Id: Ibfd92821b9603f4ed9c256e0ececec14fa4565d0
```
  466e0cf3
- Moving encoder subexp encoding functions to subexp.{h, c}. · bb8ccf1c
  Dmitry Kovalev authored 11 years ago
```
Change-Id: I83ca53bf6def871f199a382a671f26ad7cbecbca
```
  bb8ccf1c
- Merge "fixed a bug where sse is not populated" · bc70c60b
  Ronald S. Bultje authored 11 years ago
  
  bc70c60b
- Merge "add Neon optimized add constant residual functions" · 6098e359
  Johann authored 11 years ago
  
  6098e359
- Merge "fix test compile error" · 84d08fa9
  James Zern authored 11 years ago
  
  84d08fa9
- Merge "Inline vp9_get_coef_context() (and remove vp9_ prefix)." · a487af8d
  Ronald S. Bultje authored 11 years ago
  
  a487af8d
- Merge "Minor change to prevent one level of dereference in cost_coeffs()." · 7731e538
  Ronald S. Bultje authored 11 years ago
  
  7731e538
- add Neon optimized add constant residual functions · a83cfd4d
  chm authored 11 years ago
```
- Add add_constant_residual_8x8 16x16 32x32 functions
- Tested under RealView debugger environment

Change-Id: I5c3a432f651b49bf375de6496353706a33e3e68e
```
  a83cfd4d
- Merge "Cosmetic reordering of FRAME_CONTEXT members." · d6264f9a
  Dmitry Kovalev authored 11 years ago
  
  d6264f9a
- Inlining decode_atom, decode_sb_intra, and decode_sb. · 1947828c
  Dmitry Kovalev authored 11 years ago
```
Change-Id: I41711bb994f542c5ba3d0cefd9b2e79db3c2c3a1
```
  1947828c
- fix test compile error · a63e31e8
  James Zern authored 11 years ago
```
since:
92479d95 Make update_partition_context faster

fixes:
vp9/common/vp9_blockd.h:408:22: error:
non-constant-expression cannot be narrowed from type 'int' to 'char' in
initializer list [-Wc++11-narrowing]
  char pcvalue[2] = {~(0xe << boffset), ~(0xf <<boffset)};
                     ^~~~~~~~~~~~~~~~~

Change-Id: Id5b00b9a72d00a2b314081a23879bd1fa3ce983b
```
  a63e31e8
- Enable SSE2 4x4 ADST/DCT transform · 1109b6b8
  Jingning Han authored 11 years ago
```
This commit enables SSE2 4x4 forward hybrid transform. The runtime
goes from 249 cycles down to 74 cycles. Overall around 2% speed-up
with no change in compression performance.

Change-Id: Iad4d526346e05c7be896466c05500711bb763660
```
  1109b6b8
- fixed a bug where sse is not populated · f853e662
  Yaowu Xu authored 11 years ago
```
Change-Id: I692d800af1f976c84a76f8bd66864c4b39540abc
```
  f853e662
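
The `-Wc++11-narrowing` error quoted in the "fix test compile error" entry above arises because `~(0xe << boffset)` has type `int`, and C++11 rejects a non-constant `int` inside a braced initializer for `char`. One common way to resolve this kind of error is an explicit cast, sketched below. The commit message does not say how the actual fix was made, so this is an assumption, and `fill_pcvalue()` is a hypothetical wrapper used only to make the snippet self-contained.

```c
#include <assert.h>

/* Hypothetical wrapper around the initializer from the error message. */
static void fill_pcvalue(int boffset, char out[2]) {
  assert(boffset >= 0 && boffset < 8);
  /* The explicit (char) casts make the int -> char conversion
   * intentional, which is what a narrowing diagnostic asks for. */
  const char pcvalue[2] = { (char)~(0xe << boffset),
                            (char)~(0xf << boffset) };
  out[0] = pcvalue[0];
  out[1] = pcvalue[1];
}
```

The cast keeps only the low 8 bits of each mask, which is the same value the implicit conversion produced before the compiler started rejecting it.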
28 Jun, 2013 - 13 commits

Merge "Fix switch statement in 8x8 transform" · 07b72ace
Jingning Han authored 11 years ago

07b72ace
Cosmetic reordering of FRAME_CONTEXT members. · 228b8232
Dmitry Kovalev authored 11 years ago
```
Change-Id: Id641e5188adf55e53e606e5813ae45feaf7abbd2
```
228b8232

Cleanup inside vp9_decodemv.c. · 15fefced

Dmitry Kovalev authored 11 years ago

Adding read_skip_coeff function. Renaming decode_mv to read_mv for
consistency with other function names. Removing redundant function
arguments. Renaming kfread_modes to read_intra_mode_info, read_mb_modes_mv
to read_inter_mode_info, vp9_decode_mb_mode_mv to vp9_read_mode_info,
vp9_decode_mode_mvs_init to vp9_prepare_read_mode_info. Inlining function
mb_mode_mv_init inside vp9_prepare_read_mode_info.

Change-Id: Ifee05d333da4cd331d4aff40ce41ccd9b70e494a

15fefced

Merge "Removing CONFIG_DEBUG checks on assertions." · 59070f6e
Dmitry Kovalev authored 11 years ago

59070f6e
Fix switch statement in 8x8 transform · 9def7f72
Jingning Han authored 11 years ago
```
Change-Id: I7c46354c4983feb5f6202c3ab4a1d9534da7e30f
```
9def7f72
Merge "Some minor optimizations for cost_coeffs()." · cee3bc6f
Ronald S. Bultje authored 11 years ago

cee3bc6f
Merge "Make coefficient skip condition an explicit RD choice." · ec5d09b9
Ronald S. Bultje authored 11 years ago

ec5d09b9

Inline vp9_get_coef_context() (and remove vp9_ prefix). · d00b8e5f

Ronald S. Bultje authored 11 years ago

Makes cost_coeffs() a lot faster:
4x4: 236 -> 181 cycles
8x8: 888 -> 588 cycles
16x16: 3550 -> 2483 cycles
32x32: 17392 -> 12010 cycles

Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes
from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup.

Change-Id: I16b8d595946393c8dc661599550b3f37f5718896

d00b8e5f

Merge "Decoder's code cleanup." · 0345fc3a
Dmitry Kovalev authored 11 years ago

0345fc3a

Removing CONFIG_DEBUG checks on assertions. · 8e6ce6bb

Dmitry Kovalev authored 11 years ago

Adding CHECK_MEM_ERROR macro to vp9_common.h and removing two duplicated
ones from vp9_onyx_int.h and vp9_onyxd_int.h.

Change-Id: I916afec61b3019f18193135dac7c35ed0f89b8b6

8e6ce6bb
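
A shared allocation-check macro of the kind this commit describes typically assigns the result of an allocation and reports an error if the result is null. The sketch below illustrates that general pattern only; it is not the contents of vp9_common.h. `ErrorInfo`, `report_mem_error()`, and `CHECK_MEM_ERROR_SKETCH` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical error record; a real codec would carry more state. */
typedef struct {
  int has_error;
  const char *detail;
} ErrorInfo;

static void report_mem_error(ErrorInfo *err, const char *detail) {
  assert(err != NULL);
  err->has_error = 1;
  err->detail = detail;
}

/* Assign `expr` to `lval` and record an error if the result is null.
 * The do/while (0) wrapper lets the macro be used like a statement. */
#define CHECK_MEM_ERROR_SKETCH(err, lval, expr)                          \
  do {                                                                   \
    (lval) = (expr);                                                     \
    if (!(lval)) report_mem_error((err), "Failed to allocate " #lval);   \
  } while (0)
```

Defining the macro once in a common header, as the commit does for the real `CHECK_MEM_ERROR`, avoids keeping separate encoder and decoder copies in sync.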

Minor change to prevent one level of dereference in cost_coeffs(). · e3ce2b2a

Ronald S. Bultje authored 11 years ago

4x4: 234 -> 236 cycles
8x8: 878 -> 888 cycles
16x16: 3664 -> 3550 cycles
32x32: 18134 -> 17392 cycles

Change-Id: I37a51bfbb0060a3a54f09c6045c14a989811ed78

e3ce2b2a

Some minor optimizations for cost_coeffs(). · 91d223bd

Ronald S. Bultje authored 11 years ago

Cycle timings for first 3 frames of bus (speed 0) at 1500kbps:
4x4: 298 -> 234 cycles
8x8: 1227 -> 878 cycles
16x16: 4906 -> 3664 cycles
32x32: 23426 -> 18134 cycles

Total encode time of first 50 frames of bus @ 1500kbps (speed 0) goes
from 3min0.7 to 2min51.6 seconds, i.e. 5.3% faster.

Change-Id: I68a0e1b530b0563b84a67342cca4b45146077e95

91d223bd
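
The 5.3% figure in this commit message is consistent with computing speedup as old time divided by new time, minus one. The helpers below are a small sketch of that arithmetic; `to_seconds()` and `speedup_percent()` are hypothetical names, not part of libvpx.

```c
#include <assert.h>

/* Convert a "MminS.s" style timing into seconds. */
static double to_seconds(int minutes, double seconds) {
  return 60.0 * minutes + seconds;
}

/* Speedup as a percentage: (old / new - 1) * 100.
 * Works for any consistent unit, e.g. seconds or cycle counts. */
static double speedup_percent(double old_time, double new_time) {
  assert(old_time > 0.0 && new_time > 0.0);
  return (old_time / new_time - 1.0) * 100.0;
}
```

For example, `speedup_percent(to_seconds(3, 0.7), to_seconds(2, 51.6))` evaluates to roughly 5.3.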

Make coefficient skip condition an explicit RD choice. · af660715

Ronald S. Bultje authored 11 years ago

This commit replaces zrun_zbin_boost, a method of biasing non-zero
coefficients following runs of zero-coefficients to be rounded towards
zero, with an explicit skip-block choice in the RD loop.

The logic is basically that if individual coefficients should be rounded
towards zero (from an RD point of view), the trellis/optimize loop should
take care of it. If whole blocks should be zero (from an RD point of
view), a single RD check is much more efficient than a complete
serialization of the quantization loop.

Quality change: derf +0.5% psnr, +1.6% ssim; yt +0.6% psnr, +1.1% ssim.
SIMD for quantize will follow in a separate patch. Results for other
test sets pending.

Change-Id: Ife5fa641163ac5150ac428011e87188f1937c1f4

af660715
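
The "single RD check" described in this commit message can be pictured as comparing the rate-distortion cost of coding a block's coefficients against the cost of skipping the block entirely. The sketch below assumes a plain Lagrangian cost, distortion + lambda * rate; `RdStats`, `rd_cost()`, and `choose_skip_block()` are hypothetical names, and the encoder's real cost function and data structures may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rate/distortion pair for one way of coding a block. */
typedef struct {
  double rate;       /* bits needed */
  double distortion; /* reconstruction error */
} RdStats;

/* Assumed Lagrangian cost: distortion + lambda * rate. */
static double rd_cost(RdStats s, double lambda) {
  assert(lambda >= 0.0);
  return s.distortion + lambda * s.rate;
}

/* Return true when zeroing the whole block is cheaper, in RD terms,
 * than coding its quantized coefficients. */
static bool choose_skip_block(RdStats coded, RdStats skipped, double lambda) {
  return rd_cost(skipped, lambda) < rd_cost(coded, lambda);
}
```

A larger lambda weights rate more heavily, so the same block can flip from "code the coefficients" to "skip" as lambda grows.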