Commits · f88558fb1d609ea9f6fb763d61374defb70bc275 · BC / public / external / libvpx

31 Oct, 2012 - 1 commit
- Change encoder vp8_ and vp8cx_ public symbol prefixes to vp9_. · f88558fb
  Ronald S. Bultje authored 12 years ago
```
Change-Id: Ie2e3652591b010ded10c216501ce24fd95d0aec5
```
  f88558fb
29 Oct, 2012 - 1 commit

Remove the fdct invoke macro calls · 818ee904

Jim Bankoski authored 12 years ago

Change-Id: Ica2431c655819fa012133ee7abc75a16761e5fd6

818ee904

08 Aug, 2012 - 1 commit

Merging in the sixteenth subpel uv experiment · 7d065653

Deb Mukherjee authored 12 years ago

Merges this experiment in to make it easier to run tests on
filter precision, vectorized implementation etc.

Also removes an experimental filter.

Change-Id: I1e8706bb6d4fc469815123939e9c6e0b5ae945cd

7d065653

17 Jul, 2012 - 1 commit

Restyle code · c6b9039f

John Koleszar authored 12 years ago

Approximate the Google style guide[1] so that there's a written
document to follow and tools to check compliance[2].

[1]: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml
[2]: http://google-styleguide.googlecode.com/svn/trunk/cpplint/cpplint.py

Change-Id: Idf40e3d8dddcc72150f6af127b13e5dab838685f

c6b9039f

15 Mar, 2012 - 1 commit

WebM Experimental Codec Branch Snapshot · 6035da54

Yaowu Xu authored 13 years ago

This is a code snapshot of experimental work currently ongoing for a
next-generation codec.

The codebase has been cut down considerably from the libvpx baseline.
For example, we are currently only supporting VBR 2-pass rate control
and have removed most of the code relating to coding speed, threading,
error resilience, partitions and various other features.  This is in
part to make the codebase easier to work on and experiment with, but
also because we want to have an open discussion about how the bitstream
will be structured and partitioned and not have that conversation
constrained by past work.

Our basic working pattern has been to initially encapsulate experiments
using configure options linked to #if CONFIG_XXX statements in the
code. Once experiments have matured and we are reasonably happy that
they give benefit and can be merged without breaking other experiments,
we remove the conditional compile statements and merge them in.
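
A minimal sketch of the pattern described above. The flag name CONFIG_EXAMPLE_EXPERIMENT, the function, and the tap counts are hypothetical and only illustrate how an experiment might be fenced off until it is merged:

```c
/* Hypothetical experiment flag. In a real build the configure script would
 * define it as 0 or 1 in a generated config header; it defaults to 0 here. */
#ifndef CONFIG_EXAMPLE_EXPERIMENT
#define CONFIG_EXAMPLE_EXPERIMENT 0
#endif

/* Returns the number of interpolation filter taps (illustrative values). */
static int filter_taps(void) {
#if CONFIG_EXAMPLE_EXPERIMENT
  return 8; /* experimental path, compiled only when the flag is 1 */
#else
  return 6; /* baseline path */
#endif
}
```

Once such an experiment is accepted, the #if/#else/#endif lines and the baseline branch would be deleted, leaving only the experimental path.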

Current changes include:
* Temporal coding experiment for segments (though still only 4 max, it
  will likely be increased).
* Segment feature experiment - to allow various bits of information to
  be coded at the segment level. Features tested so far include mode
  and reference frame information, limiting end of block offset and
  transform size, alongside Q and loop filter parameters, but this set
  is very fluid.
* Support for 8x8 transform - 8x8 dct with 2nd order 2x2 haar is used
  in MBs using 16x16 prediction modes within inter frames.
* Compound prediction (combination of signals from existing predictors
  to create a new predictor).
* 8 tap interpolation filters and 1/8th pel motion vectors.
* Loop filter modifications.
* Various entropy modifications and changes to how entropy contexts and
  updates are handled.
* Extended quantizer range matched to transform precision improvements.

There are also ongoing further experiments that we hope to merge in the
near future: For example, coding of motion and other aspects of the
prediction signal to better support larger image formats, use of larger
block sizes (e.g. 32x32 and up) and lossless non-transform based coding
options (especially for key frames). It is our hope that we will be
able to make regular updates and we will warmly welcome community
contributions.

Please be warned that, at this stage, the codebase is slower than the
VP8 stable branch, as most new code has not been optimized, and
even the 'C' has been deliberately written to be simple and obvious,
not fast.

The following graphs show the initial test results; the numbers in the
tables measure the compression improvement as a percentage. The
build has the following optional experiments configured:
--enable-experimental --enable-enhanced_interp --enable-uvintra
--enable-high_precision_mv --enable-sixteenth_subpel_uv

CIF Size clips:
http://getwebm.org/tmp/cif/
HD size clips:
http://getwebm.org/tmp/hd/
(stable_20120309 represents encoding results of WebM master branch
build as of commit#7a159071)

They were encoded using the following encode parameters:
--good --cpu-used=0 -t 0 --lag-in-frames=25 --min-q=0 --max-q=63
--end-usage=0 --auto-alt-ref=1 -p 2 --pass=2 --kf-max-dist=9999
--kf-min-dist=0 --drop-frame=0 --static-thresh=0 --bias-pct=50
--minsection-pct=0 --maxsection-pct=800 --sharpness=0
--arnr-maxframes=7 --arnr-strength=3 (for HD, 6 for CIF)
--arnr-type=3

Change-Id: I5c62ed09cfff5815a2bb34e7820d6a810c23183c

6035da54

23 Feb, 2012 - 1 commit

Supporting high precision 1/8-pel motion vectors · 18e90d74

Deb Mukherjee authored 13 years ago

This is the initial patch for supporting 1/8th pel
motion. Currently if we configure with enable-high-precision-mv,
all motion vectors would default to 1/8 pel. Encode and
decode syncs fine with the current code. In the next phase
the code will be refactored so that we can choose the 1/8
pel mode adaptively at a frame/segment/mb level.
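
As an illustration of what 1/8 pel precision means for a single motion vector component, here is a small sketch. It assumes the component is stored as a signed integer in units of 1/8 pixel; the helper names are hypothetical and not taken from the patch:

```c
/* Assumption: a motion vector component is a signed integer in units of
 * 1/8 pixel, so the low 3 bits hold the sub-pixel phase. */
enum { MV_FRAC_BITS = 3, MV_FRAC_MASK = (1 << MV_FRAC_BITS) - 1 };

/* Fractional part in eighths of a pixel, always in [0, 7]
 * (relies on two's complement representation for negative values). */
static int mv_eighth_pel(int mv) {
  return mv & MV_FRAC_MASK;
}

/* Whole-pixel part, rounded toward negative infinity so that
 * mv == mv_full_pel(mv) * 8 + mv_eighth_pel(mv) always holds. */
static int mv_full_pel(int mv) {
  return (mv - mv_eighth_pel(mv)) / (1 << MV_FRAC_BITS);
}
```

For example, a component of 19 splits into 2 whole pixels plus 3/8, and a component of -3 splits into -1 whole pixel plus 5/8.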

Derf results:
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hpmv.html
(about 0.83% better than 8-tap interpolation)

Patch 3: Rebased. Also adding 1/16th pel interpolation for U and V

Patch 4: HD results.
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hd_hpmv.html
Seems impressive (unless I am doing something wrong).

Patch 5: Added mmx/sse for bilateral filtering, as well as enforced
use of c-versions of subpel filters with 8-taps and 1/16th pel;
Also redesigned the 8-tap filters to reduce the cut-off in order to
introduce a denoising effect. There is a new configure option
sixteenth-subpel-uv whic...

18e90d74

16 Feb, 2012 - 1 commit

Code simplification · 79d330d7

Paul Wilkins authored 13 years ago

Removal of the pickinter.c and .h files and calls to this
code.

Removal of some code relating to real time and one pass
settings, though there is more to be done in this regard.

However, vp8_set_speed_features() now
only supports modes 0 and 1 and speeds up to 3,
so rd should always be set.

Change-Id: I62c0c1b6154ab499785baef310536080e87bc4d8

79d330d7

22 Sep, 2011 - 1 commit
- Replace vpx_ports/config.h with vpx_config.h · 1a7d25a4
  Attila Nagy authored 13 years ago
```
Just a clean-up.

Change-Id: Iea5b6dc925dcfa7db548bc1ab1a13d26ed5a2c9a
```
  1a7d25a4
20 Sep, 2011 - 3 commits

Move neon only arm functions under arm/neon. · bd0c3409

Fritz Koenig authored 13 years ago

These files don't contain generic arm code, so should
only be compiled by neon.

Change-Id: Ie712823aa04d4235e7cfe7a3b725e73ee4c3e564

bd0c3409

NEON FDCT updated to match current C code · 0c2529a8

Tero Rintaluoma authored 13 years ago

- Removed fast_fdct4x4_neon and fast_fdct8x4_neon
- Now uses short_fdct4x4 and short_fdct8x4
- Gives ~1-2% speed-up on Cortex-A8/A9

Change-Id: Ib62f2cb2080ae719f8fa1d518a3a5e71278a41ec

0c2529a8

Fixed armv5te multiplications · 3c19bc3f

Tero Rintaluoma authored 13 years ago

Rd and Rm registers should be different in 'mul'. This register
combination results in unpredictable behaviour. GCC will give
a warning and RVCT an error in this case.

Restriction applies only to armv5 targets and not for armv6 and above.

Change-Id: I378d17c51e1f16a6820814fbed43e115aaabb03e

3c19bc3f

19 Sep, 2011 - 2 commits

Updated ARMv6 forward transforms to match C · 4c3ad66b

Tero Rintaluoma authored 13 years ago

- Updated walsh transform to match C
  (based on Change Id24f3392)
- Changed fast_fdct4x4 and 8x4 to short_fdct4x4 and 8x4
  correspondingly

Change-Id: I704e862f40e315b0a79997633c7bd9c347166a8e

4c3ad66b

NEON walsh transform updated to match C · 2a4b2a00

Tero Rintaluoma authored 13 years ago

Modified original patch If2f07220885c4c3a0cae0dace34ea0e36124f001
according to comments. Scheduled code a little bit to prevent some
interlocks.

Change-Id: I338f02b881098782f82af63d97f042b85e63e902

2a4b2a00

29 Jun, 2011 - 1 commit
- clean up warnings when building arm with rtcd · 6611f669
  Johann authored 13 years ago
```
Change-Id: I3683cb87e9cb7c36fc22c1d70f0799c7c46a21df
```
  6611f669
09 Jun, 2011 - 1 commit

remove one set of 16x16 variance functions · 361717d2

Yaowu Xu authored 13 years ago

Calls to this set of functions are replaced by var16x16.

Change-Id: I5ff1effc6c1358ea06cda1517b88ec28ef551b0d

361717d2

06 Jun, 2011 - 1 commit

remove redundant functions · d4700731

Yaowu Xu authored 13 years ago

The encoder defined about 4 sets of similar functions to calculate sum,
variance or sse or a combination of them. This commit removes one set
of these functions, get8x8var and get16x16var; calls to the latter
function are replaced with var16x16, using the fact that on a 16x16 MB:
    variance == sse - sum*sum/256
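
The identity can be sketched in plain C as follows. The function names and signatures are hypothetical, not the encoder's own; the quantity returned is the sum of squared deviations from the mean difference, which is what the formula above yields for the 256 pixels of a 16x16 block:

```c
#include <assert.h>
#include <stdint.h>

/* Accumulates the sum and the sum of squares of the per-pixel differences
 * between a 16x16 source block and a 16x16 reference block. */
static void diff_sum_sse_16x16(const uint8_t *src, int src_stride,
                               const uint8_t *ref, int ref_stride,
                               int *sum, unsigned int *sse) {
  int r, c;
  assert(src_stride >= 16 && ref_stride >= 16);
  *sum = 0;
  *sse = 0;
  for (r = 0; r < 16; ++r) {
    for (c = 0; c < 16; ++c) {
      const int d = src[r * src_stride + c] - ref[r * ref_stride + c];
      *sum += d;
      *sse += (unsigned int)(d * d);
    }
  }
}

/* variance == sse - sum*sum/256; the square is taken in 64 bits because
 * sum*sum can exceed the range of a 32-bit int. */
static unsigned int variance_16x16(const uint8_t *src, int src_stride,
                                   const uint8_t *ref, int ref_stride) {
  int sum;
  unsigned int sse;
  diff_sum_sse_16x16(src, src_stride, ref, ref_stride, &sum, &sse);
  return sse - (unsigned int)(((int64_t)sum * sum) / 256);
}
```

A caller that already has sum and sse from one pass therefore does not need a separate variance routine, which is the redundancy this commit removes.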

Change-Id: I803eabd1fb3ab177780a40338cbd596dffaed267

d4700731

01 Jun, 2011 - 1 commit

neon fast quantize block pair · 61f0c090

Tero Rintaluoma authored 13 years ago

vp8_fast_quantize_b_pair_neon function added to quantize
two adjacent blocks at the same time to improve performance.
 - Additional 3-6% speedup compared to neon optimized fast
   quantizer (Tanya VGA@30fps, 1Mbps stream, cpu-used=-5..-16)

Change-Id: I3fcbf141e5d05e9118c38ca37310458afbabaa4e

61f0c090

30 May, 2011 - 1 commit

adds preload for armv6 encoder asm · 5305e79e

Tero Rintaluoma authored 13 years ago

Added preload instructions to armv6 encoder optimizations.
About 5% average speed-up on Tegra2 for VGA@30fps sequence.

Change-Id: I41d74737720fb71ce7a316f07555357822f3347e

5305e79e

25 May, 2011 - 1 commit
- Return sse value in vp8_variance SSE2 functions · b6679879
  Yunqing Wang authored 13 years ago
```
Minor modification.

Change-Id: I09511d38fd1451d5c4106a48acdb3f766ce59cb7
```
  b6679879
06 May, 2011 - 1 commit

neon fast quantizer updated · 33fa7c4e

Tero Rintaluoma authored 13 years ago

vp8_fast_quantize_b_neon function updated and further optimized.
 - match current C implementation of fast quantizer
 - updated to use asm_enc_offsets for structure members
 - updated ads2gas scripts to handle alignment issues

Change-Id: I5cbad9c460ad8ddb35d2970a8684cc620711c56d

33fa7c4e

01 Apr, 2011 - 1 commit

Wrapper function removed from vp8_subtract_b_neon function call · cec76a36

Tero Rintaluoma authored 14 years ago

Address calculations moved from encodemb_arm.c file to neon
optimized assembly function to save cycles in function calls.
 - vp8_subtract_b_neon_func replaced with vp8_subtract_b_neon
   that contains all needed address calculations
 - unnecessary file encodemb_arm.c removed
 - consistent with ARMv6 optimized version

Change-Id: I6cbc1a2670b56c2077f59995fcf8f70786b4990b

cec76a36

29 Mar, 2011 - 1 commit

ARMv6 optimized subtract functions · 6fdc9aa7

Tero Rintaluoma authored 14 years ago

Adds following ARMv6 optimized functions to encoder:
  - vp8_subtract_b_armv6
  - vp8_subtract_mby_armv6
  - vp8_subtract_mbuv_armv6

Gives 1-5% speed-up depending on input sequence and encoding
parameters. Functions have one stall cycle inside the loop body
on Cortex pipeline.

Change-Id: I19cca5408b9861b96f378e818eefeb3855238639

6fdc9aa7

28 Mar, 2011 - 1 commit

Half pixel variance further optimized for ARMv6 · f5e43346

Tero Rintaluoma authored 14 years ago

Half pixel interpolations optimized in variance calculations. Separate
function calls to vp8_filter_block2d_bil_x_pass_armv6 are avoided. On
average, performance improvement is 6-7% for VGA@30fps sequences.

Change-Id: Idb5f118a9d51548e824719d2cfe5be0fa6996628

f5e43346

21 Mar, 2011 - 1 commit

ARMv6 optimized fdct4x4 · a61785b6

Tero Rintaluoma authored 14 years ago

Optimized fdct4x4 (8x4) for ARMv6 instruction set.
  - No interlocks in Cortex-A8 pipeline
  - One interlock cycle in ARM11 pipeline
  - About 2.16 times faster than current C-code compiled with -O3

Change-Id: I60484ecd144365da45bb68a960d30196b59952b8

a61785b6

15 Mar, 2011 - 1 commit
- Add vp8_variance8x8_armv6 and vp8_sub_pixel_variance8x8_armv6 functions · 71bcd9f1
  Attila Nagy authored 14 years ago
```
Change-Id: I08edaffc62514907fa5e90e1689269e467c857f5
```
  71bcd9f1
14 Mar, 2011 - 1 commit
- Add vp8_mse16x16_armv6 function · e54dcfe8
  Attila Nagy authored 14 years ago
```
Change-Id: I77e9f2f521a71089228f96e2db72524189364ffb
```
  e54dcfe8
11 Mar, 2011 - 1 commit

ARMv6 optimized quantization · 7ab08e1f

Tero Rintaluoma authored 14 years ago

Adds new ARMv6 optimized function vp8_fast_quantize_b_armv6
to the encoder.

Change-Id: I40277ec8f82e8a6cbc453cf295a0cc9b2504b21e

7ab08e1f

23 Feb, 2011 - 1 commit

ARMv6 optimized half pixel variance calculations · 8ae92aef

Tero Rintaluoma authored 14 years ago

Adds following ARMv6 optimized functions to the encoder:
 - vp8_variance_halfpixvar16x16_h_armv6
 - vp8_variance_halfpixvar16x16_v_armv6
 - vp8_variance_halfpixvar16x16_hv_armv6

Change-Id: I1e9c2af7acd2a51b72b3845beecd990db4bebd29

8ae92aef

18 Feb, 2011 - 1 commit
- remove unused vp8_predict_dc function · 3ed8fe87
  John Koleszar authored 14 years ago
```
Change-Id: I64fa47889c54cfed094a674c49ef0996d49bdd42
```
  3ed8fe87
11 Feb, 2011 - 1 commit

ARMv6 optimized sad16x16 · 1ef86980

Tero Rintaluoma authored 14 years ago

Adds a new ARMv6 optimized function vp8_sad16x16_armv6 to encoder.

Change-Id: Ibbd7edb8b25cb7a5b522d391b1e9a690fe150e57

1ef86980

10 Feb, 2011 - 1 commit

Fix relative include paths · 02321de0

John Koleszar authored 14 years ago

Allow compiling without adding vp8/{common,encoder,decoder} to the
include paths.

Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c

02321de0

09 Feb, 2011 - 1 commit

Adds armv6 optimized variance calculation · cb14764f

Tero Rintaluoma authored 14 years ago

Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates
ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6
and adds new assembly file for variance16x16 calculation.
 - vp8_filter_block2d_bil_first_pass_armv6   (integrated)
 - vp8_filter_block2d_bil_second_pass_armv6  (integrated)
 - vp8_variance16x16_armv6 (new)
 - bilinearfilter_arm.h (new)
Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db

cb14764f

08 Feb, 2011 - 1 commit

clarify *_offsets.asm differences · 40dcae9c

Johann authored 14 years ago

it's difficult to mux the *_offsets.c files because of header conflicts.
make three instead, name them consistently and partition the contents
to allow building them as required.

Change-Id: I8f9768c09279f934f44b6c5b0ec363f7943bb796

40dcae9c

28 Jan, 2011 - 1 commit

Adds "armvX-none-rvct" targets · 11a222f5

Tero Rintaluoma authored 14 years ago

Adds following targets to configure script to support RVCT compilation
without operating system support (for Profiler or bare metal images).
 - armv5te-none-rvct
 - armv6-none-rvct
 - armv7-none-rvct

To strip OS specific parts from the code "os_support"-config was added
to script and CONFIG_OS_SUPPORT flag is used in the code to exclude OS
specific parts such as OS specific includes and function calls for
timers and threads etc. This was done to enable RVCT compilation for
profiling purposes or running the image on bare metal target with
Lauterbach.

Removed separate AREA directives for READONLY data in armv6 and neon
assembly files to fix the RVCT compilation. Otherwise
"ldr <reg>, =label" syntax would have been needed to prevent linker
errors. This syntax is not supported by older gnu assemblers.

Change-Id: I14f4c68529e8c27397502fbc3010a54e505ddb43

11a222f5

27 Jan, 2011 - 1 commit

clean up implicit declaration warnings for neon · 27000ed6

Johann authored 14 years ago

Change-Id: I6ca2d89f355839c4c770773c09fc69dcea7c1406
warning: implicit declaration of function
  'vp8_variance_halfpixvar16x16_[h|v|hv]_neon'
  'vp8_sub_pixel_variance16x16_neon_func'

27000ed6

25 Jan, 2011 - 2 commits

move new neon subpixel function · 2168a944

Johann authored 14 years ago

previously wasn't guarded with ifdef ARMV7, causing a link error with
ARMV6

Change-Id: I0526858be0b5f49b2bf11e9090180b2a6c48926d

2168a944

Fix issue 262, vp8cx_pack_tokens_into_partitions_armv5 · 3bf235a4

Attila Nagy authored 14 years ago

http://code.google.com/p/webm/issues/detail?id=262
The function was assuming that partitions have an equal number of mb_rows,
which is not always true.
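
To see why the row counts can differ, here is a small sketch. It assumes macroblock rows are distributed to partitions round-robin (row r goes to partition r % num_parts); the function name and parameters are hypothetical:

```c
#include <assert.h>

/* Number of macroblock rows that land in partition `part` when rows are
 * assigned round-robin (assumption: row r goes to partition r % num_parts). */
static int mb_rows_in_partition(int mb_rows, int num_parts, int part) {
  assert(mb_rows >= 0 && num_parts > 0 && part >= 0 && part < num_parts);
  /* Rows part, part + num_parts, part + 2*num_parts, ... below mb_rows. */
  return (mb_rows - part + num_parts - 1) / num_parts;
}
```

With 10 macroblock rows and 4 partitions, partitions 0 and 1 receive 3 rows each while partitions 2 and 3 receive only 2, so a loop that assumes the same count for every partition would be wrong.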

Change-Id: I59ed40117fd408392a85c633beeb5340ed2f4b25

3bf235a4

18 Jan, 2011 - 1 commit

Modify calling of NEON code in sub-pixel search · ce6c954d

Yunqing Wang authored 14 years ago

In vp8_find_best_sub_pixel_step_iteratively(), many times xoffset
and yoffset are specific values - (4,0) (0,4) and (4,4). Modified
code to call simplified NEON version at these specific offsets to
help with the performance.
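
The dispatch idea can be sketched as below. The enum and function are hypothetical stand-ins, not the actual NEON entry points, and the sketch assumes offsets are expressed in eighths of a pixel so that 4 means exactly half a pixel:

```c
/* Which variance routine a sub-pixel offset pair would select
 * (hypothetical names; offsets assumed to be in 1/8 pixel units). */
typedef enum {
  SUBPEL_GENERIC, /* general sub-pixel variance for arbitrary offsets */
  SUBPEL_HALF_H,  /* (4,0): horizontal half pixel */
  SUBPEL_HALF_V,  /* (0,4): vertical half pixel */
  SUBPEL_HALF_HV  /* (4,4): half pixel in both directions */
} subpel_kernel;

static subpel_kernel pick_subpel_kernel(int xoffset, int yoffset) {
  if (xoffset == 4 && yoffset == 0) return SUBPEL_HALF_H;
  if (xoffset == 0 && yoffset == 4) return SUBPEL_HALF_V;
  if (xoffset == 4 && yoffset == 4) return SUBPEL_HALF_HV;
  return SUBPEL_GENERIC;
}
```

The half-pixel cases can skip the general filter arithmetic because each output is a plain average of neighbouring pixels, which is presumably where the saving comes from.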

Change-Id: Iaf896a0f7aae4697bd36a49e182525dd1ef1ab4d

ce6c954d

28 Dec, 2010 - 1 commit

Use the fast quantizer for inter mode selection · 516ea846

Scott LaVarnway authored 14 years ago

Use the fast quantizer for inter mode selection and the
regular quantizer for the rest of the encode for good quality,
speed 1.  Both performance and quality were improved.  The
quality gains will make up for the quality loss mentioned in
I9dc089007ca08129fb6c11fe7692777ebb8647b0.

Change-Id: Ia90bc9cf326a7c65d60d31fa32f6465ab6984d21

516ea846

14 Dec, 2010 - 1 commit

shrink TOKENEXTRA and vp8_extra_bit_struct · 825adc46

Johann authored 14 years ago

Per John's previous change, shrink TOKENEXTRA from 20 to 8 bytes
original: b7b1e6fb
reverted: 41f4458a

Also drop unused field from vp8_extra_bit_struct

Update ARM ASM to deal with this change. In particular, Extra is signed
and needs to be sign-extended when loaded.
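
The sign-extension point can be illustrated in C. This sketch assumes the shrunken Extra field is 16 bits wide, which the message does not state; the helper names are hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Widens a 16-bit field WITHOUT sign extension. For a signed field this is
 * the mistake to avoid: negative values turn into large positive ones. */
static int32_t load16_zero_extend(const void *p) {
  uint16_t v;
  memcpy(&v, p, sizeof v);
  return (int32_t)v;
}

/* Reads the same field as signed, so negative values survive widening. */
static int32_t load16_sign_extend(const void *p) {
  int16_t v;
  memcpy(&v, p, sizeof v);
  return (int32_t)v;
}
```

In assembly the same distinction is the choice between a zero-extending and a sign-extending halfword load.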

Change-Id: Ibd0ddc058432bc7bb09222d6ce4ef77e93a30b41

825adc46