Commits · debb9c68c8ea92b80627138f95de901cb39cf8dc · BC / public / external / libvpx

07 Aug, 2013 - 4 commits

Use low precision 32x32fdct for encodemb in speed1 · debb9c68

Jingning Han authored 11 years ago

The low precision 32x32 fdct has all the intermediate steps within
16-bit depth, hence allowing faster SSE2 implementation, at the
expense of larger round-trip error. It was used in the rate-distortion
optimization search loop only.

Using the low precision version in place of the high precision one
reduces compression performance by about 0.7% (derf, stdhd) at
speed 0. At speed 1, the derf set goes down by only 0.017%.

Change-Id: I4e7d18fac5bea5317b91c8e7dabae143bc6b5c8b

debb9c68
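
The precision trade-off described in debb9c68 can be illustrated with a toy 2-point butterfly. This is only a sketch under stated assumptions, not libvpx code: `half_round`, `roundtrip_error_high`, and `roundtrip_error_low` are hypothetical names, and the halving step merely stands in for whatever intermediate rounding keeps values within 16 bits in the real 32x32 transform.

```c
#include <stdlib.h>

/* Halve x with rounding (half away from zero). Assumption: this stands in
 * for intermediate rounding that keeps values inside a 16-bit range. */
static int half_round(int x) {
  return x >= 0 ? (x + 1) / 2 : -((1 - x) / 2);
}

/* Full-precision butterfly: sum and difference keep every bit, so the
 * inverse recovers the inputs exactly. Returns the total absolute error. */
static int roundtrip_error_high(int a, int b) {
  const int s = a + b, d = a - b;
  const int ra = (s + d) / 2, rb = (s - d) / 2;
  return abs(ra - a) + abs(rb - b);
}

/* Reduced-precision butterfly: the intermediates are halved before being
 * stored, so one bit is lost and the round trip may be slightly off. */
static int roundtrip_error_low(int a, int b) {
  const int s = half_round(a + b), d = half_round(a - b);
  const int ra = s + d, rb = s - d;
  return abs(ra - a) + abs(rb - b);
}
```

In this toy, the reduced-precision path reconstructs the inputs exactly when `a + b` is even and is off by one otherwise. That is the kind of small round-trip error the commit message accepts in exchange for a faster 16-bit implementation.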

Neon version of vp9_short_idct4x4_add. · 78182538

Christian Duvivier authored 11 years ago

Change-Id: Idec4cae0cb9b3a29835fd2750d354c1393d47aa4

78182538

Merge "Clean ups of the subpel search functions" · 296931c8
Deb Mukherjee authored 11 years ago

296931c8

Clean ups of the subpel search functions · 71b43b0f

Deb Mukherjee authored 11 years ago

Removes some unused code and speed features, and organizes the
interfaces for fractional mv step functions for use in new speed
features to come.

In the process a new speed feature - number of iterations per
step during the subpel search - is exposed.

No change when this parameter is set to the original value of 3.

Results:
subpel_iters_per_step = 3: baseline
subpel_iters_per_step = 2: psnr -0.067%, 1% speedup
subpel_iters_per_step = 1: psnr -0.331%, 3-4% speedup

Change-Id: I2eba8a21f6461be8caf56af04a5337257a5693a8

71b43b0f
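
The effect of subpel_iters_per_step in 71b43b0f can be sketched as a coarse-to-fine refinement with a capped number of passes per precision level. This is a minimal illustration and not the libvpx implementation: `mv_t`, `cost_fn`, `dist_cost`, and `subpel_search` are hypothetical names, the four-neighbour candidate set is an assumption, and a real encoder cost would combine distortion and motion vector rate.

```c
/* Hypothetical motion vector in 1/8-pel units. */
typedef struct { int row, col; } mv_t;

/* Hypothetical cost callback: lower is better. */
typedef unsigned int (*cost_fn)(mv_t mv, const void *ctx);

/* Demo cost: squared distance to a target vector passed through ctx. */
static unsigned int dist_cost(mv_t mv, const void *ctx) {
  const mv_t *t = (const mv_t *)ctx;
  const int dr = mv.row - t->row, dc = mv.col - t->col;
  return (unsigned int)(dr * dr + dc * dc);
}

/* Refine `start` at 1/2, 1/4 and 1/8 pel. At each precision, run at most
 * `iters_per_step` passes; each pass tests the four neighbours of the
 * current centre and moves to the best one if it lowers the cost. */
static mv_t subpel_search(mv_t start, int iters_per_step,
                          cost_fn cost, const void *ctx) {
  static const int dr[4] = { -1, 1, 0, 0 };
  static const int dc[4] = { 0, 0, -1, 1 };
  mv_t best = start;
  unsigned int best_cost = cost(best, ctx);

  for (int step = 4; step >= 1; step >>= 1) {
    for (int it = 0; it < iters_per_step; ++it) {
      const mv_t center = best;
      int improved = 0;
      for (int k = 0; k < 4; ++k) {
        const mv_t cand = { center.row + dr[k] * step,
                            center.col + dc[k] * step };
        const unsigned int c = cost(cand, ctx);
        if (c < best_cost) {
          best_cost = c;
          best = cand;
          improved = 1;
        }
      }
      if (!improved) break; /* nothing better at this precision */
    }
  }
  return best;
}
```

With the demo cost and a target of (3, -5), three passes per step reach the target, while a single pass per step stops one unit short at (3, -4). Fewer passes mean fewer cost evaluations but a slightly worse vector, which is the same direction as the psnr/speed trade-off reported above.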

06 Aug, 2013 - 21 commits

Merge "Motion vector code cleanup." · 63ec0587
Dmitry Kovalev authored 11 years ago

63ec0587
Merge "Place holder for high-precision 32x32 fdct" · 2c091f97
Jingning Han authored 11 years ago

2c091f97

variance x86inc guards · 5b307886

Jim Bankoski authored 11 years ago

also fixed bug in sad calcs

Change-Id: I6571fcbe37556c16ae32be66dc0fd879852aac1d

5b307886

sse3 intrapred x86inc protected · 6eb1254b

Jim Bankoski authored 11 years ago

Change-Id: I4a3c83119cdf8a205920034c8019d855d5504605

6eb1254b

Merge "Flexible support for various pattern searches" · fac7c8c9
Deb Mukherjee authored 11 years ago

fac7c8c9

sad + miscellaneous updates · c9126e0b

Jim Bankoski authored 11 years ago

Enable use_x86inc as a command-line option. Fix bug with sse2 when
x86inc is disabled. Add SAD asm protection to the x86inc protection.

Change-Id: Iee0f9dd235ea10e8ace512eb362ba9bebe8c9df6

c9126e0b

Merge "Inlining vp9_get_pred_probs_switchable_interp function." · 8725ca2e
Dmitry Kovalev authored 11 years ago

8725ca2e

Flexible support for various pattern searches · 15b5a6a2

Deb Mukherjee authored 11 years ago

Adds a few pattern searches to achieve various tradeoffs
between motion estimation complexity and performance.
The search framework is unified across these searches so that a
common pattern search function is used for all. Besides, it will
be easier to experiment with various patterns or combinations
thereof at different scales in the future.

The new pattern search is multi-scale and is capable of using
different patterns at different scales.

The new hex search uses 8 points at the smallest scale
and 6 points at other scales.
Two other pattern searches - big-diamond and square are
also added. Big diamond uses 4 points at the smallest scale and
8 points in diamond shape at the larger scales.
Square is very similar conceptually to the default n-step search
but is somewhat faster since it keeps only one survivor across
all scales.

Psnr/speed-up results on derf300:

hex: -1.6% psnr, 6-8% speedup
big-diamond: -0.96% psnr, 4-5% speedup
square: -0.93% psnr, 4-5% speedup

Change-Id: I02a7ef5193f762601e0994e2c99399a3535a43d2

15b5a6a2
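
The unified multi-scale pattern search described in 15b5a6a2 can be sketched as one driver function that walks from the largest scale to the smallest, picks a point pattern per scale, and keeps a single survivor. This is an illustrative sketch only: `mv_t`, `pattern_t`, `cost_fn`, `dist_cost`, `pattern_search`, and the exact offset tables are assumptions, chosen to match the "8 points at the smallest scale, 6 points at other scales" description of the hex search.

```c
typedef struct { int row, col; } mv_t;                     /* hypothetical */
typedef unsigned int (*cost_fn)(mv_t mv, const void *ctx); /* hypothetical */

#define MAX_POINTS 8
typedef struct {
  int num_points;
  mv_t offsets[MAX_POINTS];
} pattern_t;

/* Assumed tables: 8 neighbours at scale 0, a 6-point hexagon above it. */
static const pattern_t kSmallest = {
  8, { {-1,-1}, {-1,0}, {-1,1}, {0,-1}, {0,1}, {1,-1}, {1,0}, {1,1} }
};
static const pattern_t kHex = {
  6, { {-2,-1}, {-2,1}, {0,2}, {2,1}, {2,-1}, {0,-2} }
};

/* Demo cost: squared distance to a target vector passed through ctx. */
static unsigned int dist_cost(mv_t mv, const void *ctx) {
  const mv_t *t = (const mv_t *)ctx;
  const int dr = mv.row - t->row, dc = mv.col - t->col;
  return (unsigned int)(dr * dr + dc * dc);
}

/* Search from scale `top_scale` down to 0, keeping one survivor. */
static mv_t pattern_search(mv_t start, int top_scale,
                           cost_fn cost, const void *ctx) {
  mv_t best = start;
  unsigned int best_cost = cost(best, ctx);

  for (int s = top_scale; s >= 0; --s) {
    const pattern_t *p = (s == 0) ? &kSmallest : &kHex;
    const int mult = (s == 0) ? 1 : 1 << (s - 1);
    for (;;) {
      /* The cost strictly decreases on every move, so this terminates. */
      const mv_t center = best;
      int improved = 0;
      for (int k = 0; k < p->num_points; ++k) {
        const mv_t cand = { center.row + p->offsets[k].row * mult,
                            center.col + p->offsets[k].col * mult };
        const unsigned int c = cost(cand, ctx);
        if (c < best_cost) {
          best_cost = c;
          best = cand;
          improved = 1;
        }
      }
      if (!improved) break; /* drop to the next smaller scale */
    }
  }
  return best;
}
```

Because the pattern is just a table lookup keyed on scale, trying a different shape (for example a diamond or a square) would only mean swapping the tables, which is the flexibility the commit message is after.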

Place holder for high-precision 32x32 fdct · 28566a6c

Jingning Han authored 11 years ago

Resolve compile warnings on re-define FDCT32x32_2D template.

Change-Id: Idb3a54ef8d2710ce7245b726379a0e5c875f5cad

28566a6c

Inlining vp9_get_pred_probs_switchable_interp function. · 0c800656

Dmitry Kovalev authored 11 years ago

There was no benefit in having this function. For example, inside
read_switchable_filter_type the switchable filter context was calculated twice.

Change-Id: I79cd5bf95cbc0f6d8bf91a2e32289e01b18dcff1

0c800656

Merge "Move fdct32x32 SSE2 implementation in separate file." · 7d61f8fe
Jingning Han authored 11 years ago

7d61f8fe
Merge "intrapred x86inc guards" · efc94102
Jim Bankoski authored 11 years ago

efc94102

Motion vector code cleanup. · a39abe26

Dmitry Kovalev authored 11 years ago

Converting arguments of two functions (clamp_mv_ref, lower_mv_precision)
from int_mv* to MV*. Rewriting is_inside function to make it much shorter.

Change-Id: Ie4c4cf3eccd46707c7df099ec21fb1b61c72fc7a

a39abe26

Merge "Finally removing all old block size constants." · 3e51acaf
Dmitry Kovalev authored 11 years ago

3e51acaf
Merge "Changing the order switchable filter enum constants." · 4a692e41
Dmitry Kovalev authored 11 years ago

4a692e41
Merge "Removing unused functions." · 25b7dc08
Dmitry Kovalev authored 11 years ago

25b7dc08
Merge "Add variance based mode/skipping" · 33afddad
Deb Mukherjee authored 11 years ago

33afddad

Move fdct32x32 SSE2 implementation in separate file. · 3d98205f

Christian Duvivier authored 11 years ago

This is in preparation for the SSE2 version of the high-precision
32x32 forward DCT which will share a lot of code with the existing
low precision version used for rate-distortion search.

Change-Id: I7084b6bdfb480b1fabb8493fb14e3f7fcc7888c0

3d98205f

intrapred x86inc guards · 25ec1375

Jim Bankoski authored 11 years ago

Change-Id: If0399d8e11f4ebe75a5c91abb8d6a52a7709065b

25ec1375

block error / x86inc mods · 62c6aa88

Jim Bankoski authored 11 years ago

Change-Id: Icb607745634e10b9bac5019d06661ece09fcdb40

62c6aa88

reworked config for use_x86_inc · a93b115c

Jim Bankoski authored 11 years ago

Support enabling it or disabling it. Moved read out to configure.sh
so that it's done once instead of in make and in config.

Change-Id: I73a9190cf31de9f03e8a577f478fa522f8c01c8b

a93b115c

05 Aug, 2013 - 14 commits

Merge changes I082959ab,Ib6932640 · d115cd8b

James Zern authored 11 years ago

* changes:
  vp9/decoder: threaded row-based loop filter
  vp9/decoder: add thread worker

d115cd8b

Finally removing all old block size constants. · b9c7d04e

Dmitry Kovalev authored 11 years ago

Change-Id: I3aae21e88b876d53ecc955260479980ffe04ad8d

b9c7d04e

fixed script problem with config_force_x86_inc · f4837579

Jim Bankoski authored 11 years ago

Change-Id: I226e5094d216b09dc47fa5511a66e2d314608000

f4837579

Merge "Begin to restrict x86inc.asm usage" · a5a73224
Jim Bankoski authored 11 years ago

a5a73224

Add variance based mode/skipping · 8b3faccb

Deb Mukherjee authored 11 years ago

Adds a speed feature to skip all intra modes other than
DC_PRED if the source variance is small. This feature is
made part of speed 1 and up.

Results on derf300: psnr -0.07%, speedup about 1-2%

Also uses the source variance to fine-tune the early
termination criteria when FLAG_EARLY_TERMINATE is on.
This feature is made part of speed 2 and up.

Results on derf300: psnr -0.52%, speedup about 5-7%

Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232

8b3faccb
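
The first speed feature in 8b3faccb (skip every intra mode except DC_PRED when the source variance is small) can be sketched as a simple predicate. This is a hedged illustration, not the libvpx code: the `demo_intra_mode` enum, `block_variance`, `skip_intra_mode`, and the threshold parameter are all hypothetical, and the commit message does not state the actual threshold.

```c
#include <stdint.h>

/* Hypothetical subset of intra prediction modes. */
typedef enum {
  DEMO_DC_PRED,
  DEMO_V_PRED,
  DEMO_H_PRED,
  DEMO_TM_PRED
} demo_intra_mode;

/* Unnormalised variance of a w x h block: sum of squares minus
 * (sum squared / pixel count). Assumption: any variance measure would do. */
static unsigned int block_variance(const uint8_t *src, int stride,
                                   int w, int h) {
  int64_t sum = 0, sse = 0;
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int v = src[r * stride + c];
      sum += v;
      sse += v * v;
    }
  }
  return (unsigned int)(sse - (sum * sum) / (w * h));
}

/* Returns 1 when `mode` should be skipped: at speed 1 and up, only
 * DC prediction is tried if the source block is nearly flat. */
static int skip_intra_mode(demo_intra_mode mode, unsigned int source_variance,
                           int speed, unsigned int var_thresh) {
  if (speed < 1) return 0;             /* feature is off at speed 0 */
  if (mode == DEMO_DC_PRED) return 0;  /* DC_PRED is always evaluated */
  return source_variance < var_thresh;
}
```

The idea is that a nearly flat block has little directional structure for the other intra modes to exploit, so testing them mostly costs encoder time, which is consistent with the small psnr loss reported above.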

Merge "cleanups after bw bh code" · 9f988a2e
Jim Bankoski authored 11 years ago

9f988a2e

vp9/decoder: threaded row-based loop filter · a0ffa279

James Zern authored 11 years ago

Currently the only threaded option for vp9 decode. Enabled when the
decoder config thread count is > 1.

Change-Id: I082959abac9e31aa4a38ed9fd68b94680e57f4df

a0ffa279

vp9/decoder: add thread worker · 183b77d5

James Zern authored 11 years ago

vp9/decoder/vp9_thread.[hc]
Original source:
 http://git.chromium.org/webm/libwebp.git
 100644 blob b1615d0fb8d311666b2fa4561076c62d72c2e3ff  src/utils/thread.c
 100644 blob 13a61a4c84194c3374080cbf03d881d3cd6af40d  src/utils/thread.h

Local modifications:
 - s/WebP/VP9/g
 - camelcase functions -> lower with _'s

Change-Id: Ib6932640ee34f8b4782c6fbd15864a59d5d4c5fe

183b77d5

Changing the order switchable filter enum constants. · 3f611555

Dmitry Kovalev authored 11 years ago

This changeset allows us to remove the vp9_switchable_interp and
vp9_switchable_interp_map arrays and makes the code much clearer. Actually, we
still have to use these mappings, but only inside the read_interp_filter_type
and write_interp_filter_type functions.

Change-Id: I4026c6f8c4acefba6c81421b7bacbaa52cc45f50

3f611555

cleanups after bw bh code · 5d2cb7ea

Jim Bankoski authored 11 years ago

Const bw/bh params that should have been const. Additional formatting.

Change-Id: Icd36a5c9dc17dadd7284315ac0d6fef1a565ca16

5d2cb7ea

Update README · 9ab47772

Paweł Hajdan authored 11 years ago

- new date
- add VP9 to the title
- update list of available targets

Change-Id: I56263336db393020bac5da8e42fbac3a276ffb1f

9ab47772

Begin to restrict x86inc.asm usage · c3809f3d

Jim Bankoski authored 11 years ago

Chromium does not support 32-bit builds for Mac which use x86inc.asm.
Make the files which include it work if 64-bit or PIC is not enabled,
starting with vp9_copy_sse2.asm.

Consolidate these targets in vp9_rtcd_defs.sh

Change-Id: If18f0b957a611efd085a3ee7d245cf1eb91e8248

c3809f3d

Replacing long block size enum values with shorter ones (2). · d007446b

Dmitry Kovalev authored 11 years ago

Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b

d007446b

Merge "Cleaning up vp9_build_inter_predictor function." · 319867d7
Dmitry Kovalev authored 11 years ago

319867d7

04 Aug, 2013 - 1 commit

Merge "Replacing "txfm" with "tx" in identifiers." · 78671e2e
Dmitry Kovalev authored 11 years ago

78671e2e