Commits · 632289b31fd11229c875c116f4281e3ab6f42115 · BC / public / external / libvpx

01 Jul, 2013 - 1 commit
- fix a mismatch in cpuused 2 · 632289b3
  Yaowu Xu authored 11 years ago
```
Change-Id: I921c9faba6386535aaf717a54301dd346a9b8540
```
  632289b3
29 Jun, 2013 - 12 commits
- Merge "Enable SSE2 4x4 ADST/DCT transform" · 993942ce
  Jingning Han authored 11 years ago
  
  993942ce
- SSE2 version of vp9_short_fdct32x32_rd. · 466e0cf3
  Christian Duvivier authored 11 years ago
```
43,000 -> 5,750 cycles, about 7.5x faster.

Change-Id: Ibfd92821b9603f4ed9c256e0ececec14fa4565d0
```
  466e0cf3
- Merge "fixed a bug where sse is not populated" · bc70c60b
  Ronald S. Bultje authored 11 years ago
  
  bc70c60b
- Merge "add Neon optimized add constant residual functions" · 6098e359
  Johann authored 11 years ago
  
  6098e359
- Merge "fix test compile error" · 84d08fa9
  James Zern authored 11 years ago
  
  84d08fa9
- Merge "Inline vp9_get_coef_context() (and remove vp9_ prefix)." · a487af8d
  Ronald S. Bultje authored 11 years ago
  
  a487af8d
- Merge "Minor change to prevent one level of dereference in cost_coeffs()." · 7731e538
  Ronald S. Bultje authored 11 years ago
  
  7731e538
- add Neon optimized add constant residual functions · a83cfd4d
  chm authored 11 years ago
```
- Add add_constant_residual_8x8 16x16 32x32 functions
- Tested under RealView debugger environment

Change-Id: I5c3a432f651b49bf375de6496353706a33e3e68e
```
  a83cfd4d
- Merge "Cosmetic reordering of FRAME_CONTEXT members." · d6264f9a
  Dmitry Kovalev authored 11 years ago
  
  d6264f9a
- fix test compile error · a63e31e8
  James Zern authored 11 years ago
```
since:
92479d95 Make update_partition_context faster

fixes:
vp9/common/vp9_blockd.h:408:22: error:
non-constant-expression cannot be narrowed from type 'int' to 'char' in
initializer list [-Wc++11-narrowing]
  char pcvalue[2] = {~(0xe << boffset), ~(0xf <<boffset)};
                     ^~~~~~~~~~~~~~~~~

Change-Id: Id5b00b9a72d00a2b314081a23879bd1fa3ce983b
```
  a63e31e8
- Enable SSE2 4x4 ADST/DCT transform · 1109b6b8
  Jingning Han authored 11 years ago
```
This commit enables the SSE2 4x4 forward hybrid transform. The runtime
goes from 249 cycles down to 74 cycles. Overall around 2% speed-up
at no compression performance change.

Change-Id: Iad4d526346e05c7be896466c05500711bb763660
```
  1109b6b8
- fixed a bug where sse is not populated · f853e662
  Yaowu Xu authored 11 years ago
```
Change-Id: I692d800af1f976c84a76f8bd66864c4b39540abc
```
  f853e662
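
The "fix test compile error" entry above (a63e31e8) quotes a Clang `-Wc++11-narrowing` diagnostic: `~(0xe << boffset)` has type `int`, and a C++11 braced initializer will not narrow a non-constant `int` to `char` implicitly. The actual diff is not shown in this log, so the following is only a minimal sketch of one common remedy, an explicit cast. The helper name `make_pcvalues` is hypothetical and does not come from libvpx.

```c
#include <assert.h>

/* Hypothetical helper (not from libvpx): builds the two mask values from the
 * quoted line. The explicit (char) casts spell out the int -> char
 * conversion, so the braced initializer is accepted by C and C++11 alike. */
static void make_pcvalues(int boffset, char out[2]) {
  assert(boffset >= 0 && boffset < 8);
  {
    char pcvalue[2] = {(char)~(0xe << boffset), (char)~(0xf << boffset)};
    out[0] = pcvalue[0];
    out[1] = pcvalue[1];
  }
}
```

The cast keeps only the low byte of each mask, which is the same value the original implicit conversion produced on common platforms.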
28 Jun, 2013 - 19 commits

Merge "Fix switch statement in 8x8 transform" · 07b72ace
Jingning Han authored 11 years ago

07b72ace
Cosmetic reordering of FRAME_CONTEXT members. · 228b8232
Dmitry Kovalev authored 11 years ago
```
Change-Id: Id641e5188adf55e53e606e5813ae45feaf7abbd2
```
228b8232
Merge "Removing CONFIG_DEBUG checks on assertions." · 59070f6e
Dmitry Kovalev authored 11 years ago

59070f6e
Fix switch statement in 8x8 transform · 9def7f72
Jingning Han authored 11 years ago
```
Change-Id: I7c46354c4983feb5f6202c3ab4a1d9534da7e30f
```
9def7f72
Merge "Some minor optimizations for cost_coeffs()." · cee3bc6f
Ronald S. Bultje authored 11 years ago

cee3bc6f
Merge "Make coefficient skip condition an explicit RD choice." · ec5d09b9
Ronald S. Bultje authored 11 years ago

ec5d09b9

Inline vp9_get_coef_context() (and remove vp9_ prefix). · d00b8e5f

Ronald S. Bultje authored 11 years ago

Makes cost_coeffs() a lot faster:
4x4: 236 -> 181 cycles
8x8: 888 -> 588 cycles
16x16: 3550 -> 2483 cycles
32x32: 17392 -> 12010 cycles

Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes
from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup.

Change-Id: I16b8d595946393c8dc661599550b3f37f5718896

d00b8e5f

Merge "Decoder's code cleanup." · 0345fc3a
Dmitry Kovalev authored 11 years ago

0345fc3a

Removing CONFIG_DEBUG checks on assertions. · 8e6ce6bb

Dmitry Kovalev authored 11 years ago

Adding CHECK_MEM_ERROR macro to vp9_common.h and removing two duplicated
ones from vp9_onyx_int.h and vp9_onyxd_int.h.

Change-Id: I916afec61b3019f18193135dac7c35ed0f89b8b6

8e6ce6bb
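
The body of `CHECK_MEM_ERROR` is not reproduced in this log. As a rough illustration of what an allocation-checking macro of this kind usually looks like, here is a sketch under stated assumptions: `struct error_info`, `report_mem_error`, and the `_SKETCH` suffix are all hypothetical, and the real macro reports through libvpx's own error machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical error record; libvpx has its own error-reporting types. */
struct error_info {
  int has_error;
  const char *detail;
};

/* Hypothetical handler: records the failure instead of aborting. */
static void report_mem_error(struct error_info *err, const char *detail) {
  assert(err != NULL);
  err->has_error = 1;
  err->detail = detail;
}

/* Assign the result of expr to lval, then report if the result is NULL.
 * The check is unconditional, i.e. not compiled out in release builds. */
#define CHECK_MEM_ERROR_SKETCH(err, lval, expr)             \
  do {                                                      \
    (lval) = (expr);                                        \
    if (!(lval))                                            \
      report_mem_error((err), "Failed to allocate " #lval); \
  } while (0)
```

Keeping a single definition in a shared header means the encoder and decoder cannot drift apart in how they handle allocation failures.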

Minor change to prevent one level of dereference in cost_coeffs(). · e3ce2b2a

Ronald S. Bultje authored 11 years ago

4x4: 234 -> 236 cycles
8x8: 878 -> 888 cycles
16x16: 3664 -> 3550 cycles
32x32: 18134 -> 17392 cycles

Change-Id: I37a51bfbb0060a3a54f09c6045c14a989811ed78

e3ce2b2a

Some minor optimizations for cost_coeffs(). · 91d223bd

Ronald S. Bultje authored 11 years ago

Cycle timings for first 3 frames of bus (speed 0) at 1500kbps:
4x4: 298 -> 234 cycles
8x8: 1227 -> 878 cycles
16x16: 4906 -> 3664 cycles
32x32: 23426 -> 18134 cycles

Total encode time of first 50 frames of bus @ 1500kbps (speed 0) goes
from 3min0.7 to 2min51.6 seconds, i.e. 5.3% faster.

Change-Id: I68a0e1b530b0563b84a67342cca4b45146077e95

91d223bd

Make coefficient skip condition an explicit RD choice. · af660715

Ronald S. Bultje authored 11 years ago

This commit replaces zrun_zbin_boost, a method of biasing non-zero
coefficients following runs of zero-coefficients to be rounded towards
zero, with an explicit skip-block choice in the RD loop.

The logic is basically that if individual coefficients should be rounded
towards zero (from a RD point of view), the trellis/optimize loop should
take care of it. If whole blocks should be zero (from a RD point of
view), a single RD check is much more efficient than a complete
serialization of the quantization loop.

Quality change: derf +0.5% psnr, +1.6% ssim; yt +0.6% psnr, +1.1% ssim.
SIMD for quantize will follow in a separate patch. Results for other
test sets pending.

Change-Id: Ife5fa641163ac5150ac428011e87188f1937c1f4

af660715
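
The whole-block decision described in this commit message can be pictured as a single comparison of two rate-distortion costs. The sketch below assumes the textbook cost form (distortion plus lambda times rate); the encoder's real cost computation is fixed-point and is not shown in this log. All names here (`rd_cost`, `block_choice`, `choose_skip`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical RD cost: distortion plus lambda-weighted rate. */
static int64_t rd_cost(int64_t lambda, int64_t rate, int64_t distortion) {
  return distortion + lambda * rate;
}

struct block_choice {
  int skip;     /* 1 if the block should be coded as all-zero */
  int64_t cost; /* RD cost of the chosen option */
};

/* One RD check per block: compare coding the quantized coefficients
 * against signalling the block as skipped, and keep the cheaper option. */
static struct block_choice choose_skip(int64_t lambda,
                                       int64_t coded_rate, int64_t coded_dist,
                                       int64_t skip_rate, int64_t skip_dist) {
  struct block_choice c;
  const int64_t coded = rd_cost(lambda, coded_rate, coded_dist);
  const int64_t skipped = rd_cost(lambda, skip_rate, skip_dist);
  c.skip = skipped < coded;
  c.cost = c.skip ? skipped : coded;
  return c;
}
```

For example, with lambda 10, a coded option of rate 50 and distortion 100 costs 600, while a skip option of rate 1 and distortion 400 costs 410, so the sketch picks skip.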

Merge "Minor cleanups" · 1b5421f3
Yaowu Xu authored 11 years ago

1b5421f3
Merge "Optimize partition search order" · 64bb996e
Yaowu Xu authored 11 years ago

64bb996e
Minor cleanups · 8b9eea0a
Yaowu Xu authored 11 years ago
```
Change-Id: I379617c1c731a686b3f7e032b8805860c1055b12
```
8b9eea0a

Optimize partition search order · 1374a06b

Yaowu Xu authored 11 years ago

This commit changes the partition search order so that rectangular
partitions are checked after square partitions. It also adds a speed
feature that skips the rectangular partition check when NONE is better
than SPLIT in the RD sense.

This feature speeds up the encoder by roughly 1.5x, at the following cost in compression:
-0.91% on cif set
-0.56% on stdhd set

Change-Id: I0d2d06993041aa9ea9073fcc39c54f73a127dfa4

1374a06b
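
A minimal sketch of the search order this commit message describes, assuming the per-partition RD costs are already available in an array. In the encoder each cost would come from a full search of that partition type, which is the work the early exit avoids. The enum, `pick_partition`, and the `evals` counter are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum partition { PART_NONE, PART_SPLIT, PART_HORZ, PART_VERT, PART_COUNT };

/* Returns the index of the best partition found. *evals counts how many
 * partition types were examined, standing in for the search effort. */
static int pick_partition(const int64_t cost[PART_COUNT],
                          int skip_rect_feature, int *evals) {
  int best = PART_NONE;
  int p;

  /* Square partitions (NONE, SPLIT) are always checked first. */
  *evals = 2;
  if (cost[PART_SPLIT] < cost[best]) best = PART_SPLIT;

  /* Speed feature: if NONE already beats SPLIT, skip HORZ and VERT. */
  if (skip_rect_feature && cost[PART_NONE] < cost[PART_SPLIT]) return best;

  for (p = PART_HORZ; p <= PART_VERT; ++p) {
    ++*evals;
    if (cost[p] < cost[best]) best = p;
  }
  return best;
}
```

With costs {100, 120, 90, 110} the feature stops after two evaluations and returns NONE even though HORZ is cheaper, which illustrates where a compression loss can come from.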

Merge "Fix tile independence with both column tiling and static_thresh set." · 5de08054
Ronald S. Bultje authored 11 years ago

5de08054
Merge "variance_test: add missing ClearSystemState..." · 5ec57c91
James Zern authored 11 years ago

5ec57c91
Fix tile independence with both column tiling and static_thresh set. · fd4eed3b
Ronald S. Bultje authored 11 years ago
```
Change-Id: I0b2be0ec2c410a527f88b95a44f24ac967b2dac1
```
fd4eed3b

27 Jun, 2013 - 8 commits
- Decoder's code cleanup. · 3231da0a
  Dmitry Kovalev authored 11 years ago
```
Using vp9_set_pred_flag function instead of custom code, adding
decode_tokens function which is now called from decode_atom,
decode_sb_intra, and decode_sb.

Change-Id: Ie163a7106c0241099da9c5fe03069bd71f9d9ff8
```
  3231da0a
- Add Neon optimized loop filter functions. · 1d6dc1b7
  Frank Galligan authored 11 years ago
```
- Added vp9_loop_filter_horizontal_edge_neon and
  vp9_loop_filter_vertical_edge_neon.
- The functions are based off the vp8 loopfilter
  functions.
- Matches x86 md5 checksum.

Change-Id: Id1c4dddb03584227e5ecd29f574a6ac27738fdd0
```
  1d6dc1b7
- Merge "General cleanup in segmentation-related code." · a3664258
  Dmitry Kovalev authored 11 years ago
  
  a3664258
- Merge "Moving subexp encoding functions in separate vp9_dsubexp.c file." · be83ef31
  Dmitry Kovalev authored 11 years ago
  
  be83ef31
- Inline quantize so idiv instruction gets removed from inner loop. · 7a049be6
  Ronald S. Bultje authored 11 years ago
```
Encoding time of first 50 frames of bus @ 1500kbps (speed 0) goes from
3min15.0 to 3min10.9, i.e. 2.1% faster overall.

Change-Id: If592ee99be09bcd34a7c8498347f44e7305e982c
```
  7a049be6
- Merge "Auto adapt step size feature." · 05ffdf26
  Paul Wilkins authored 11 years ago
  
  05ffdf26
- Merge "Start adaptive threshold for each mode at max." · 59af9049
  Paul Wilkins authored 11 years ago
  
  59af9049
- Merge "Change meaning of cpi->sf.first_step and rename." · 5bcf069c
  Paul Wilkins authored 11 years ago
  
  5bcf069c
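
The "Inline quantize so idiv instruction gets removed from inner loop" entry above (7a049be6) does not show its diff. The general mechanism can be sketched as follows: when a divisor reaches a loop as a runtime argument, the compiler generally has to emit an integer divide per iteration, whereas inlining a call with a literal divisor lets it substitute cheaper shifts and adds. The function names and the exact arithmetic below are hypothetical and are not taken from libvpx.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inner loop. If this were called out of line, 'div' would be
 * an ordinary runtime value and each iteration would need a divide. */
static inline void dequant_scaled(const int16_t *q, int16_t *dq, int n,
                                  int step, int div) {
  int i;
  assert(div != 0);
  for (i = 0; i < n; ++i) dq[i] = (int16_t)(q[i] * step / div);
}

/* Wrappers pass literal divisors. Once dequant_scaled is inlined here, the
 * compiler sees a division by a constant and can avoid the divide
 * instruction (division by 1 disappears, division by 2 becomes a shift). */
static void dequant_div1(const int16_t *q, int16_t *dq, int n, int step) {
  dequant_scaled(q, dq, n, step, 1);
}

static void dequant_div2(const int16_t *q, int16_t *dq, int n, int step) {
  dequant_scaled(q, dq, n, step, 2);
}
```

Whether the divide is actually eliminated depends on the compiler and optimization level; the results are the same either way, only the generated instructions differ.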