Commits · 458f4fedd2d32ca5b7185df44656c4ccba2bae8d · BC / public / external / libvpx

05 Nov, 2010 - 1 commit

improve average framerate calculation · f7e187d3

John Koleszar authored 14 years ago

Change Ice204e86 identified a problem with bitrate undershoot due to
low precision in the timestamps passed to the library. This patch
takes a different approach by calculating the duration of this frame
and passing it to the library, rather than using a fixed duration
and letting the library average it out with higher precision
timestamps. This part of the fix only applies to vpxenc.

This patch also attempts to fix the problem for generic applications
that may have made the same mistake vpxenc did. Instead of
calculating this frame's duration by the difference of this frame's
and the last frame's start time, we use the end times instead. This
allows the framerate calculation to scavenge "unclaimed" time from
the last frame. For instance:

  start |  end  | calculated duration
  ======+=======+====================
    0ms    33ms   33ms
   33ms    66ms   33ms
   66ms    99ms   33ms
  100ms   133ms   34ms

Change-Id: I92be4b3518e0bd530e97f90e69e75330a4c413fc
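
The end-time approach in the table above can be sketched in a few lines of C. This is only an illustration: `duration_tracker` and `frame_duration_ms` are hypothetical names, not part of the libvpx API, and timestamps are assumed to be in whole milliseconds.

```c
#include <stdint.h>

/* Hypothetical helper (not the libvpx API): remembers where the previous
 * frame ended so the next duration can absorb any "unclaimed" time. */
typedef struct {
    int64_t last_end_ms; /* end time of the previous frame */
} duration_tracker;

/* Duration is measured end-to-end, so a gap between the previous frame's
 * end and this frame's start is credited to this frame. */
int64_t frame_duration_ms(duration_tracker *t, int64_t start_ms, int64_t end_ms)
{
    int64_t duration = end_ms - t->last_end_ms;
    (void)start_ms; /* deliberately unused: only end times matter here */
    t->last_end_ms = end_ms;
    return duration;
}
```

Fed the four frames from the table, this yields 33, 33, 33 and 34 ms, so the millisecond lost to rounding between 99 ms and 100 ms is recovered.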


01 Nov, 2010 - 1 commit

SSSE3 version of fast quantizer · ff4a71f4

Scott LaVarnway authored 14 years ago

(test clip: tulip)
For good quality mode with speed=1, this gave the encoder
a small (2 - 3%) performance boost.

Change-Id: I8a1d4269465944ac0819986c2f0be4b0a2ee0b35


29 Oct, 2010 - 1 commit

Finding first label · dcee88ea

Scott LaVarnway authored 14 years ago

Using tables for the label count and label offset.

Change-Id: Iac3d5b292c37341a881be0af282f5cac3b3e01eb


28 Oct, 2010 - 4 commits

Save XMM registers in asm functions · 6614563b

Yunqing Wang authored 14 years ago

XMM6/7 are used in these functions, and need to be saved.

Change-Id: I3dfaddaf2a69cd4bf8e8735c7064b17bac5a14e5


Fix full-search SAD function crash in Visual Studio · 7e3a1e73

Yunqing Wang authored 14 years ago

Unlike GCC, the Visual Studio compiler doesn't allocate the SAD output
array with 16-byte alignment, which causes a crash in Visual Studio builds.

Change-Id: Ia755cf5a807f12929bda8db94032bb3c9d0c2362
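
One portable way to guarantee 16-byte alignment regardless of compiler is to over-allocate and round the pointer up. The sketch below shows that general technique only; it is not necessarily how this commit fixed the crash, and `align16` is a hypothetical helper name.

```c
#include <stdint.h>

/* Rounds a pointer up to the next 16-byte boundary. The caller must
 * over-allocate the backing buffer by at least 15 bytes so that the
 * aligned region still fits inside it. */
void *align16(void *p)
{
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + 15u) & ~(uintptr_t)15u);
}
```

A caller that needs eight `unsigned int` SAD results would reserve `8 * sizeof(unsigned int) + 15` bytes and hand the aligned pointer to the assembly routine.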


Eliminate more warnings. · 97b766a4

Timothy B. Terriberry authored 14 years ago

This eliminates a large set of warnings exposed by the Mozilla build
 system (Use of C++ comments in ISO C90 source, commas at the end of
 enum lists, a couple incomplete initializers, and signed/unsigned
 comparisons).
It also eliminates many (but not all) of the warnings exposed by newer
 GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite
 without checking the return values).
There are a few spurious warnings left on my system:

../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used
 uninitialized in this function
gcc seems to be unable to figure out that the value shortcut doesn't
 change between the two if blocks that test it here.

../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned
 expression >= 0 is always true
../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned
 expression >= 0 is always true
This is true, so far as it goes, but it's comparing against an enum,
 and the C standard does not mandate that enums be unsigned, so the
 checks can't be removed.

Change-Id: Iead6cd561a2afaa3d801fd63f1d8d58953da7426


Eliminate more warnings. · c4d7e5e6

Timothy B. Terriberry authored 14 years ago

This eliminates a large set of warnings exposed by the Mozilla build
 system (Use of C++ comments in ISO C90 source, commas at the end of
 enum lists, a couple incomplete initializers, and signed/unsigned
 comparisons).
It also eliminates many (but not all) of the warnings exposed by newer
 GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite
 without checking the return values).
There are a few spurious warnings left on my system:

../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used
 uninitialized in this function
gcc seems to be unable to figure out that the value shortcut doesn't
 change between the two if blocks that test it here.

../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned
 expression >= 0 is always true
../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned
 expression >= 0 is always true
This is true, so far as it goes, but it's comparing against an enum, and the C
 standard does not mandate that enums be unsigned, so the checks can't be
 removed.

Change-Id: Iaf689ae3e3d0ddc5ade00faa474debe73b8d3395


27 Oct, 2010 - 3 commits

Full search SAD function optimization in SSE4.1 · 71ecb5d7

Yunqing Wang authored 14 years ago

Use mpsadbw, and calculate 8 SADs at once. Function list:
vp8_sad16x16x8_sse4
vp8_sad16x8x8_sse4
vp8_sad8x16x8_sse4
vp8_sad8x8x8_sse4
vp8_sad4x4x8_sse4

(test clip: tulip)
For best quality mode, this gave the encoder a 5% performance boost.
For good quality mode with speed=1, this gave the encoder a 3%
performance boost.

Change-Id: I083b5a39d39144f88dcbccbef95da6498e490134


Fix half-pixel variance RTCD functions · a0ae3682

John Koleszar authored 14 years ago

This patch fixes the system dependent entries for the half-pixel
variance functions in both the RTCD and non-RTCD cases:

  - The generic C versions of these functions are now correct.
    Before, all three cases called the hv code.

  - Wire up the ARM functions in RTCD mode

  - Created stubs for x86 to call the optimized subpixel functions
    with the correct parameters, rather than falling back to C
    code.

Change-Id: I1d937d074d929e0eb93aacb1232cc5e0ad1c6184


Add half-pixel variance RTCD functions · 209d82ad

John Koleszar authored 14 years ago

NEON has optimized 16x16 half-pixel variance functions, but they
were not part of the RTCD framework. Add these functions to RTCD,
so that other platforms can make use of this optimization in the
future and special-case ARM code can be removed.

A number of functions were taking two variance functions as
parameters. These functions were changed to take a single
parameter, a pointer to a struct containing all the variance
functions for that block size. This provides additional flexibility
for calling additional variance functions (the half-pixel special
case, for example) and by initializing the table for all block sizes,
we don't have to construct this function pointer table for each
macroblock.

Change-Id: I78289ff36b2715f9a7aa04d5f6fbe3d23acdc29c
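
The "single pointer to a struct of variance functions" idea might look roughly like the sketch below. All names (`variance_fn_table`, `variance4x4_c`, `score_block`) are hypothetical and only model the pattern; the half-pixel slot simply reuses the full-pixel function as a placeholder.

```c
typedef unsigned int (*variance_fn)(const unsigned char *src, int src_stride,
                                    const unsigned char *ref, int ref_stride,
                                    unsigned int *sse);

/* Hypothetical per-block-size table: callers receive one pointer to this
 * struct instead of several separate function pointers. */
typedef struct {
    variance_fn vf;         /* full-pixel variance */
    variance_fn halfpix_hv; /* half-pixel special case */
} variance_fn_table;

/* Plain C 4x4 variance: sum of squared differences minus sum^2 / 16. */
unsigned int variance4x4_c(const unsigned char *src, int src_stride,
                           const unsigned char *ref, int ref_stride,
                           unsigned int *sse)
{
    int sum = 0;
    unsigned int sq = 0;
    for (int r = 0; r < 4; r++) {
        for (int c = 0; c < 4; c++) {
            int d = src[r * src_stride + c] - ref[r * ref_stride + c];
            sum += d;
            sq += (unsigned int)(d * d);
        }
    }
    *sse = sq;
    return sq - (unsigned int)((sum * sum) >> 4);
}

/* Built once at init time and shared by every macroblock.
 * Placeholder: a real table would install a distinct half-pixel routine. */
const variance_fn_table table_4x4 = { variance4x4_c, variance4x4_c };

/* A consumer only needs the table pointer to reach any variant. */
unsigned int score_block(const variance_fn_table *fns,
                         const unsigned char *src, const unsigned char *ref,
                         int stride, unsigned int *sse)
{
    return fns->vf(src, stride, ref, stride, sse);
}
```

Because the table is `const` and filled in once, adding another variant later means adding a field rather than changing every call signature.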


26 Oct, 2010 - 4 commits

make vp8_recon16x16mb{,y} RTCD functions · d6c67f02

John Koleszar authored 14 years ago

ARM NEON has a platform specific version of vp8_recon16x16mb, though
it's just a stub to extract the various parameters from the
MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using
that function's prototype directly will be a better long term solution,
but it's quite an invasive change.

Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620


make arm hex search the generic implementation · 96cf6588

John Koleszar authored 14 years ago

The ARM version of vp8_hex_search() is a faster implementation
of the same algorithm. Since it doesn't use any ARM specific
code, it can be made the default implementation. This removes
a linking error.

Change-Id: I77d10f2c16b2515bff4522c350004e03b7659934


arm: remove duplicate functions · d330a587

John Koleszar authored 14 years ago

These functions were true duplicates of functions present in the
generic code. This fixes some of the link errors when building
with --enable-shared --enable-pic.

Change-Id: Idff26599d510d954e439207883607ad6b74df20c


add missing GET_GOT/RESTORE_GOT pairs · b523dd51

John Koleszar authored 14 years ago

These functions made global references but did not set up the GOT,
causing compilation failures in PIC mode.

Change-Id: Iac473bf46733f87eb2e001cd736af4acf73fa51d


25 Oct, 2010 - 4 commits

Fix leaked file descriptor with ENTROPY_STATS · c3fd2c4e

Martin Ettl authored 14 years ago

cppcheck found a leaked file descriptor in the debugging code
enabled by defining ENTROPY_STATS. Fixes issue #60.

Change-Id: I0c1d0669cb94d44fed77860f97b82763be06b7cb


quiet compiler · 385865f8

Johann authored 14 years ago

clean up compiler warnings, man in the yellow hat warnings, and start to
remove unused #includes

Change-Id: I6267e98d9b3024b6fb1ef2732b29067a33cb96f6


Add runtime CPU detection support for ARM. · b71962fd

Timothy B. Terriberry authored 14 years ago

The primary goal is to allow a binary to be built which supports
 NEON, but can fall back to non-NEON routines, since some Android
 devices do not have NEON, even if they are otherwise ARMv7 (e.g.,
 Tegra).
The configure-generated flags HAVE_ARMV7, etc., are used to decide
 which versions of each function to build, and when
 CONFIG_RUNTIME_CPU_DETECT is enabled, the correct version is chosen
 at run time.
In order for this to work, the CFLAGS must be set to something
 appropriate (e.g., without -mfpu=neon for ARMv7, and with
 appropriate -march and -mcpu for even earlier configurations), or
 the native C code will not be able to run.
The ASFLAGS must remain set for the most advanced instruction set
 required at build time, since the ARM assembler will refuse to emit
 them otherwise.
I have not attempted to make any changes to configure to do this
 automatically.
Doing so will probably require the addition of new configure options.

Many of the hooks for RTCD on A...
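
The general shape of run-time dispatch described above can be sketched as a function pointer chosen from CPU capability flags. Everything here is an assumption made for illustration: the flag `HYPOTHETICAL_CPU_HAS_NEON`, the selector `select_sad`, and the "optimized" routine, which just forwards to the C version so the sketch stays portable.

```c
#include <stddef.h>

typedef unsigned int (*sad_fn)(const unsigned char *a,
                               const unsigned char *b, size_t n);

/* Portable C reference version: sum of absolute differences. */
unsigned int sad_c(const unsigned char *a, const unsigned char *b, size_t n)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (unsigned int)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    return sum;
}

/* Stand-in for an optimized (e.g. NEON) routine. It forwards to the C
 * version here; a real build would link an assembly implementation. */
unsigned int sad_optimized_standin(const unsigned char *a,
                                   const unsigned char *b, size_t n)
{
    return sad_c(a, b, n);
}

/* Hypothetical capability bit, as a detection routine might report it. */
#define HYPOTHETICAL_CPU_HAS_NEON 0x1u

/* Chooses the implementation once, at start-up, from the detected flags. */
sad_fn select_sad(unsigned int cpu_flags)
{
    return (cpu_flags & HYPOTHETICAL_CPU_HAS_NEON) ? sad_optimized_standin
                                                   : sad_c;
}
```

The point of the pattern is that the same binary carries both versions, and the fallback path must itself be compiled without instructions the older CPU lacks.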


isolate new temporal filtering code · e81e30c2

Johann authored 14 years ago

onyx_if is getting pretty big. split out the temporal code to make it
easier to look at.

Change-Id: I207c3a94c90e91b32e3ea5e1836a53b7a990fabd


22 Oct, 2010 - 1 commit

Convert [4][4] matrices to [16] arrays. · 8f75ea6b

Timothy B. Terriberry authored 14 years ago

Most of the code that actually uses these matrices indexes them as
 if they were a single contiguous array, and coverity produces
 reports about the resulting accesses that overflow the static
 bounds of the first row.
This is perfectly legal in C, but converting them to actual [16]
 arrays should eliminate the report, and removes a good deal of
 extraneous indexing and address operators from the code.

Change-Id: Ibda479e2232b3e51f9edf3b355b8640520fdbf23
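
As a small illustration of why the flat layout is simpler, the sketch below sums the same 16 coefficients through both layouts. The function names are hypothetical; the flat version needs no row/column arithmetic and never indexes past the static bound of a row.

```c
/* Flat layout: one loop over 16 contiguous values. */
int sum_flat(const short coeffs[16])
{
    int sum = 0;
    for (int i = 0; i < 16; i++)
        sum += coeffs[i];
    return sum;
}

/* Matrix layout: flat index i corresponds to coeffs[i / 4][i % 4]. */
int sum_matrix(short coeffs[4][4])
{
    int sum = 0;
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            sum += coeffs[r][c];
    return sum;
}
```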


21 Oct, 2010 - 3 commits

Move firstpass motion map to stats packet · bb7dd5b1

John Koleszar authored 14 years ago

The first implementation of the firstpass motion map for motion
compensated temporal filtering created a file, fpmotionmap.stt,
in the current working directory. This was not safe for multiple
encoder instances. This patch merges this data into the first pass
stats packet interface, so that it is handled like the other
(numerical) firstpass stats.

The new stats packet is defined as follows:
    Numerical Stats (16 doubles) -- 128 bytes
    Motion Map                   -- 1 byte / Macroblock
    Padding                      -- to align packet to 8 bytes

The fpmotionmap.stt file can still be generated for debugging
purposes in the same way that the textual version of the stats
are available (defining OUTPUT_FPF in firstpass.c)

Change-Id: I083ffbfd95e7d6a42bb4039ba0e81f678c8183ca
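
The packet layout listed above implies a simple size calculation: 128 bytes of numerical stats, one byte per macroblock, rounded up to a multiple of 8. A minimal sketch, with the hypothetical name `firstpass_packet_size`:

```c
#include <stddef.h>

#define NUMERICAL_STATS_BYTES 128u /* 16 doubles at 8 bytes each */

/* Total packet size: stats, then a 1-byte-per-macroblock motion map,
 * then padding so the packet ends on an 8-byte boundary. */
size_t firstpass_packet_size(size_t num_macroblocks)
{
    size_t size = NUMERICAL_STATS_BYTES + num_macroblocks;
    return (size + 7u) & ~(size_t)7u;
}
```

For example, a CIF frame (352x288, so 22 x 18 = 396 macroblocks) would need 128 + 396 = 524 bytes, padded to 528.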


Add MMWORD PTR/XMMWORD PTR in subtract_sse2.asm · 4cefb443

Yunqing Wang authored 14 years ago

Change-Id: Ia649b500ef020225d8bbf611799d0f47658dc2ac

Rewrite vp8_short_walsh4x4_sse2() · fc94ffce

Yunqing Wang authored 14 years ago

This rewrite reflects changes made in commit "Improve the
accuracy of forward walsh-hadamard transform". Since this function
is not called much, only a small encoder performance gain (~0.5%)
is seen.

Change-Id: Ie9df58a43028a11fd5b115c4bbe3141f7596578b


18 Oct, 2010 - 2 commits

Add SSE2 subtract functions · 4db20765

Yunqing Wang authored 14 years ago

Instead of doing 8-bit data unpack and 16-bit subtraction, use
psubb to do 16 8-bit subtractions and pcmpgtb to preserve the
sign information. This does not bring a noticeable gain since
these functions are not called frequently.

Change-Id: I90a0dfaa3db9d422e4ada324076596ffb178548e
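
A scalar model of one byte lane may make the psubb/pcmpgtb trick easier to follow. This is a sketch of the idea only, not the assembly from the commit: `sub_u8_to_s16` is a hypothetical name, and biasing both inputs by 128 before the signed compare is an assumption about how an unsigned comparison would be obtained from pcmpgtb.

```c
#include <stdint.h>

/* Models one lane: a wrapping byte subtract supplies the low byte, and a
 * compare mask supplies the high (sign-extension) byte of the 16-bit result. */
int16_t sub_u8_to_s16(uint8_t src, uint8_t pred)
{
    /* psubb: 8-bit subtract that wraps modulo 256. */
    uint8_t low = (uint8_t)(src - pred);

    /* pcmpgtb compares signed bytes, so bias both inputs by 128 (the same
     * effect as XOR with 0x80) to compare the unsigned values. */
    int s = (int)src - 128;
    int p = (int)pred - 128;
    uint8_t high = (p > s) ? 0xFF : 0x00; /* all-ones mask when pred > src */

    /* Interleaving low and high bytes forms the 16-bit two's complement value. */
    int wide = (high << 8) | low;
    return (int16_t)(high ? wide - 65536 : wide);
}
```

For every pair of byte inputs the result equals `src - pred`, which is what the unpack-then-subtract version computes with more instructions.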


copy compiler warning fixes · ce1ce992

Johann authored 14 years ago

generic version got fixed, but not the arm version. fixes:
vp8/encoder/arm/mcomp_arm.c: In function 'vp8_full_search_sadx3':
vp8/encoder/arm/mcomp_arm.c:1208: warning: pointer targets in passing
argument 5 of 'fn_ptr->sdx3f' differ in signedness
vp8/encoder/arm/mcomp_arm.c:1208: note: expected 'unsigned int *' but
argument is of type 'int *'

and another unsigned change to keep the files similar

Change-Id: I1b6255dc3a03b90394a791ee0d15d8167d9454db


15 Oct, 2010 - 2 commits

remove dead code · 963bcd6c

Johann authored 14 years ago

vp8_diamond_search_sadx4 isn't used in arm because there is no
corresponding sdx4df as in x86. rather than keep it in sync with
../mcomp.c, delete it

vp8_hex_search had the original, more readable/understandable code #if'd
out. it's also available in ../mcomp.c, so remove the dead copy

Change-Id: Ia42aa6e23b3a2e88040f467280befec091ec080e


change to make use of more trellis quantization · 2e53e9e5

Yaowu Xu authored 14 years ago

when a subsequent frame is encoded as an alt reference frame, it is
unlikely that any mb in the current frame will be used as reference for
future frames, so we can enable quantization optimization even when
the RD constant is slightly rate-biased. The change has an overall
benefit of between 0.1% and 0.2% bit savings on the test sets based on
vpxssim scores.

Change-Id: I9aa7bc5cd573ea84e3ee655d2834c18c4460ceea


14 Oct, 2010 - 3 commits

safety check to avoid divide by 0s · 39f41a4f

Jim Bankoski authored 14 years ago

Improve bounds checking in vp8_diamond_search_sadx4() · d6da7b8e

Yunqing Wang authored 14 years ago

To know whether all 4/8 neighbor points are within the bounds,
4 bounds checks are enough, instead of checking 4 bounds for
each point (16/32 checks). This improvement reduces the cost of
vp8_diamond_search_sadx4() by 30%, and gives the encoder a 1.5%
performance gain (test options: 1 pass, good, speed=4).

Change-Id: Ie8da29d18a6ecfc9829e74ac02f6fa70e042331a
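
The saving comes from testing the extreme offsets around the search center once, rather than testing every candidate point. A simplified sketch, assuming all candidates lie within a symmetric distance `reach` of the center; the struct and function names are hypothetical:

```c
typedef struct {
    int row_min, row_max, col_min, col_max;
} search_bounds;

/* Reference check: one candidate point against all four bounds. */
int point_in_bounds(const search_bounds *b, int row, int col)
{
    return row >= b->row_min && row <= b->row_max &&
           col >= b->col_min && col <= b->col_max;
}

/* Four comparisons that cover every candidate within `reach` of the
 * center. If this passes, no per-point checks are needed. */
int all_candidates_in_bounds(const search_bounds *b, int row, int col, int reach)
{
    return row - reach >= b->row_min && row + reach <= b->row_max &&
           col - reach >= b->col_min && col + reach <= b->col_max;
}
```

When the combined check fails, a search would fall back to `point_in_bounds` for each candidate, so the slow path is only paid near the edges of the search area.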


Fix compiler warning about vp8_fast_quantize_b_impl_ssse2. · 1dc0ca13

Fritz Koenig authored 14 years ago

Typo had function defined as _ssse2 and prototyped as _sse2.

Change-Id: If9f19da1a83cff40774a90cf936d601c0bf1b7fe


13 Oct, 2010 - 1 commit

Correct QWORD usage in assembly files · 92df4a06

Fritz Koenig authored 14 years ago

QWORD was being undefined because it was being used
incorrectly.

Change-Id: I3610cefa3d6f0da4054316760f78b9694cde3876


12 Oct, 2010 - 2 commits

Centralize mb skip state calculation · 13685747

John Koleszar authored 14 years ago

This patch moves the scattered updates to the mb skip state
(mode_info_context->mbmi.mb_skip_coeff) to vp8_tokenize_mb. Recent
changes to the quantizer exposed a bug where if a macroblock
could be coded as a skip but isn't, the encoder would run the
loopfilter but the decoder wouldn't, causing a reference buffer
mismatch.

The loopfilter is controlled by a flag called dc_diff. The decoder
looks at the number of decoded coefficients when setting this flag.
The encoder sets this flag based on the skip state, since any
skippable macroblock should be transmitted as a skip. The coefficient
optimization pass (vp8_optimize_b()) could change the coefficients
such that a block that was not a skip becomes one. The encoder was
not updating the skip state in this situation for intra coded blocks.

The underlying issue predates it, but this bug was recently triggered
by enabling trellis quantization on the Y2 block in commit dcd29e36,
and by changing the quantizer range control in commit 305be4e4.

Change-Id: I5cce5da0dbc2d22f7d79ee48149f01e868a64802
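
The reason for computing the skip state in one place, after coefficient optimization, can be shown with a deliberately simplified model. It assumes a macroblock is skippable exactly when every block's end-of-block position is zero, and it ignores any special handling of the Y2 block; `mb_is_skippable` and the `eobs` array are hypothetical names for this sketch.

```c
#define BLOCKS_PER_MB 25 /* 16 Y + 4 U + 4 V + 1 Y2 */

/* Returns 1 when no block has any coded coefficients (simplified rule).
 * Must be evaluated AFTER any pass that can zero out coefficients. */
int mb_is_skippable(const int eobs[BLOCKS_PER_MB])
{
    for (int i = 0; i < BLOCKS_PER_MB; i++) {
        if (eobs[i] > 0)
            return 0;
    }
    return 1;
}
```

If the skip flag were cached before an optimization pass zeroed the last coefficient, the encoder and decoder would disagree about the macroblock, which is the kind of mismatch this commit addresses.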


Add const qualifiers to variance/SAD functions. · f4a85944

Timothy B. Terriberry authored 14 years ago

These functions should never change their input, and there's no
 reason not to declare that.
This allows them to be passed static const data.

Change-Id: Ia49fe4b01e80e9afcb24b4844817694d4da5995c
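
A short sketch of what the const qualifiers make possible: a read-only table can be passed straight to the function without a cast. The names `flat_gray` and `sum_sq_error_4x4` are hypothetical and stand in for the real variance/SAD entry points.

```c
/* Read-only reference block that can live in constant data. */
const unsigned char flat_gray[16] = {
    128, 128, 128, 128, 128, 128, 128, 128,
    128, 128, 128, 128, 128, 128, 128, 128
};

/* Both inputs are const: the function promises not to modify them,
 * so callers may pass const data such as flat_gray directly. */
unsigned int sum_sq_error_4x4(const unsigned char *src, const unsigned char *ref)
{
    unsigned int sse = 0;
    for (int i = 0; i < 16; i++) {
        int d = src[i] - ref[i];
        sse += (unsigned int)(d * d);
    }
    return sse;
}
```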


11 Oct, 2010 - 2 commits

Move vp8_strict_quantize_b inside EXACT_QUANT #define. · 82c43398

Timothy B. Terriberry authored 14 years ago

There is currently no inexact version of this function, so do not
 even compile it without EXACT_QUANT.
This will prevent someone from inadvertently trying to use it without
 the proper EXACT_QUANT setup.

Change-Id: Ia13491e0128afb281c05c9222ee5987101e4010d


Remove INTRARDOPT #define and intra_rd_opt option. · dd08db93

Timothy B. Terriberry authored 14 years ago

This is just eliminating some cruft.
Although a number of variables are declared only when INTRARDOPT
 is defined, they are used elsewhere without that protection, and
 no longer just for intra RDO.
The intra_rd_opt flag was hard-coded to 1 and never checked.

Change-Id: I83a81554ecee8053e7b4ccd8aa04e18fa60f8e4f


07 Oct, 2010 - 2 commits

Remove unused file in encoder · 7e6f7b57

Yunqing Wang authored 14 years ago

Remove vp8/encoder/x86/csystemdependent.c

Change-Id: I7c590dcd07b68704d463a1452f62f29ffb1402f4


Added vp8_fast_quantize_b_sse2 · d860f685

Scott LaVarnway authored 14 years ago

Moved vp8_fast_quantize_b_sse from quantize_mmx.asm into
quantize_sse2.asm and renamed.  Updated the assembly code to
match the C version.

Change-Id: I1766d9e1ca60e173f65badc0ca0c160c2b51b200


06 Oct, 2010 - 1 commit

optimize fast_quantizer c version · d338d14c

Yaowu Xu authored 14 years ago

As the zbin and rounding constants are normalized, rounding effectively
does the zbinning, therefore the zbin operation can be removed. In
addition, the memsets on the two arrays are no longer necessary.

Change-Id: If39c353c42d7e052296cb65322e5218810b5cc4c
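
To see why a separate zero-bin test becomes redundant, consider a simplified per-coefficient model: add the rounding constant, multiply by a 16-bit fixed-point reciprocal of the step size, and shift. Small magnitudes fall to zero on their own. The function name and the example constants are assumptions for illustration, not values from the encoder.

```c
#include <stdlib.h>

/* Simplified scalar model of a "fast" quantizer for one coefficient.
 * quant is a fixed-point reciprocal of the step size (scaled by 65536). */
short fast_quantize_coeff(short z, short round, short quant)
{
    int x = abs(z);                      /* magnitude of the coefficient */
    int y = ((x + round) * quant) >> 16; /* scale; small values become 0 */
    return (short)(z < 0 ? -y : y);      /* restore the sign */
}
```

With a step size of 8 (`quant = 8192`) and `round = 4`, a magnitude of 3 quantizes to 0 and a magnitude of 4 quantizes to 1, with no explicit threshold comparison.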


04 Oct, 2010 - 2 commits

nasm: address labels 'rel label' vice 'wrt rip' · 5cdc3a4c

Jan Kratochvil authored 14 years ago

nasm does not support `label wrt rip', it requires `rel label'. It is
still fully compatible with yasm.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. A few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50


nasm: match instruction length (movd/movq) to parameters · e114f699

Jan Kratochvil authored 14 years ago

nasm requires the instruction length (movd/movq) to match its
parameters. I find it clearer to really use 64bit instructions when
we use 64bit registers in the assembly.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. A few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91


02 Oct, 2010 - 1 commit

Tune effect of motion on KF/GF boost in two pass; · 788c0eb5

Paul Wilkins authored 14 years ago

This code adjusts the impact of the amount and speed of motion
on GF and KF boost.

Sections with lots of slow motion will tend to have a
somewhat bigger boost and sections with fast motion may
have less.

There is a knock-on effect on the selection of the active
quantizer range.

This will likely require further tuning but helps with a couple
of particularly bad edge cases.

Change-Id: Ic2449cda7305672b69acf42fc0a845b77ac98d40
