Commits · 68cf24310b1bc407948fe6130732d066a5c02e7d · BC / public / external / libvpx

23 Jul, 2010 - 3 commits

Rate control bug with long key frame interval. · 9404c7db

Paul Wilkins authored 14 years ago

In two pass encodes, the calculation of the number of bits
allocated to a KF group had the potential to overflow for high data
rates if the interval is very long.

We observed the problem in one test clip that had a section with
an 8000 frame gap between key frames.

Change-Id: Ic48eb86271775d7573b4afd166b567b64f25b787

9404c7db
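
The overflow described in the commit above can be illustrated with a minimal sketch. It assumes the KF group budget is roughly a per-frame bit budget multiplied by the number of frames in the group; the function names and the clamping policy are hypothetical and are not taken from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: total bit budget for a key frame group.
 * Widening to 64 bits before multiplying avoids 32-bit overflow. */
static int64_t kf_group_bits(int32_t bits_per_frame, int32_t frames_to_key) {
    assert(bits_per_frame >= 0 && frames_to_key >= 0);
    return (int64_t)bits_per_frame * frames_to_key;
}

/* Hypothetical policy: saturate when the budget must fit in 32 bits. */
static int32_t kf_group_bits_clamped(int32_t bits_per_frame,
                                     int32_t frames_to_key) {
    int64_t bits = kf_group_bits(bits_per_frame, frames_to_key);
    return bits > INT32_MAX ? INT32_MAX : (int32_t)bits;
}
```

For example, at roughly 20 Mbit/s and 30 frames per second (about 666666 bits per frame), an 8000 frame interval needs more than 5 billion bits, which does not fit in a signed 32-bit integer.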

Make the quantizer exact. · e04e2935

Timothy B. Terriberry authored 14 years ago

This replaces the approximate division-by-multiplication in the
 quantizer with an exact one that costs just one add and one
 shift extra.
The asm versions have not been updated in this patch, and thus
 have been disabled, since the new method requires different
 multipliers which are not compatible with the old method.

Change-Id: I53ac887af0f969d906e464c88b1f4be69c6b1206

e04e2935
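
As a rough illustration of exact division by multiplication, the sketch below uses the general round-up reciprocal technique for 16-bit unsigned values: one multiply, then one extra add and one extra shift. The type and function names are hypothetical, and the multipliers and shifts the patch actually stores may be derived differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical precomputed divisor: a multiplier plus a shift. */
typedef struct {
    uint32_t mult;
    int shift;
} exact_divisor;

static exact_divisor make_exact_divisor(uint16_t d) {
    exact_divisor e;
    int l = 0;
    assert(d != 0);
    while ((1u << l) < d) l++; /* l = ceil(log2(d)) */
    /* mult = floor(2^(16+l) / d) - 2^16 + 1 */
    e.mult = (uint32_t)((((uint64_t)1 << (16 + l)) / d) - 65536 + 1);
    e.shift = l;
    return e;
}

/* Returns floor(x / d) for every 16-bit x, without a divide instruction. */
static uint16_t exact_div(uint16_t x, exact_divisor e) {
    uint32_t hi = ((uint32_t)x * e.mult) >> 16; /* high half of the product */
    return (uint16_t)((hi + x) >> e.shift);     /* the extra add and shift */
}
```

Adding `x` back before the final shift stands in for the implicit 2^16 term of a 17-bit multiplier, which is what lets the result match true integer division over the whole 16-bit range.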

80 character line length on Arnr LUT · d576690b

Paul Wilkins authored 14 years ago

Tweaked table to fit to 80 characters.

Change-Id: Ie6ba80e0b31b33e23d2bf78599abe223369fcefb

d576690b

22 Jul, 2010 - 1 commit

Remove CONFIG_NEW_TOKENS files. · 08eed049

Fritz Koenig authored 14 years ago

These files were out of date and no longer maintained.
Token decoding has implemented the no-crash code which
is incompatible with this arm assembly code.

Change-Id: Ibf729886c56fca48181af60b44bda896c30023fc

08eed049

19 Jul, 2010 - 3 commits

ARNR Lookup Table. · 0ba32632

Paul Wilkins authored 14 years ago

Change submitted for Adrian Grange. Convert threshold
calculation in ARNR filter to a lookup table.

Change-Id: I12a4bbb96b9ce6231ce2a6ecc2d295610d49e7ec

0ba32632

Parameter limit change. · 02277b8a

Paul Wilkins authored 14 years ago

Change maximum ARNR filter width to 15.

Change-Id: I3b72450ea08e96287445ec18810630ee2292954c

02277b8a

Rate control fix for ARNR filtered frames. · bf18069c

Paul Wilkins authored 14 years ago

Previously we had assumed that it was necessary to give a full frame's
bit allocation to the alt ref frame if it has been created through temporal
filtering. This is not the case. The active max quantizer control
ensures that sufficient bits are allocated if needed and allocating a
full frame's worth of bits creates an excessive overhead for the ARF.

Change-Id: I83c95ed7bc7ce0e53ccae6ff32db5a97f145937a

bf18069c

16 Jul, 2010 - 1 commit

Fix: Incorrect 'cols' calculation in temporal filter. · 7c938f4d

Paul Wilkins authored 14 years ago

Change-Id: I37f10fbe4fbb505c1d34980a59af3e817c287e22

7c938f4d

12 Jul, 2010 - 1 commit

limit range checking code for L[k] to CONFIG_DEBUG. patch by timeless@gmail.com · 80f0e7a7

Michael Kohler authored 14 years ago

80f0e7a7

07 Jul, 2010 - 2 commits

Fix a compiling error on armv6 · 3d0a1eda

Yaowu Xu authored 14 years ago

The issue was caused by a bad merge in Change I5559d1e8

Change-Id: I6563f652bc1500202de361f8f51d11cc6ddf3331

3d0a1eda

Fix misspelled "skiped" in onyxc_int.h to "skipped". · 1e23f451

Michael Kohler authored 14 years ago

Signed-off-by: Michael Kohler <michaelkohler@live.com>

1e23f451

01 Jul, 2010 - 1 commit

Fix bug in 1st pass motion compensation · 0618ff14

Adrian Grange authored 14 years ago

In the case where the best reference mv is not (0,0) a secondary
search is carried out centered on (0,0). However, rather than
sending tmp_err into the search function, motion_error was
inadvertently passed.

As a result tmp_err remains set at INT_MAX and the (0,0)-based
search result will never be selected, even if it is better.

Change-Id: I3c82b246c8c82ba887b9d3fb4c9e0a0f2fe5a76c

0618ff14

30 Jun, 2010 - 2 commits

Update loopfilter frame/filter/sharp info for multithread · 308e867f

John Koleszar authored 14 years ago

Change I9fd1a5a4 updated the multithreaded loopfilter to avoid
reinitializing several parameters if they haven't changed from the
last frame, but the code to update the last frame's parameters wasn't
invoked in the multithreaded case.

Change-Id: Ia23d937af625c01dd739608e02d110f742b7e1f2

308e867f

Add loopfilter initialization fix in multithreading code · 29d586b4

Yunqing Wang authored 14 years ago

Modified loopfilter initialization to avoid unnecessary operations.

Change-Id: I9fd1a5a49edc1cb8116c2a72a6908b1e437459ec

29d586b4

29 Jun, 2010 - 3 commits

Improve SSE2 loopfilter functions · bead039d

Yunqing Wang authored 14 years ago

Restructured and rewrote SSE2 loopfilter functions. Combined u and
v into one function to take advantage of SSE2 128-bit registers.
Tests on test clips showed a 4% decoder performance improvement on
Linux desktop.

Change-Id: Iccc6669f09e17f2224da715f7547d6f93b0a4987

bead039d

Further adjustment of RD behaviour with Q and Zbin. · 1ca39bf2

Paul Wilkins authored 14 years ago

Following conversations with Tim T (Derf) I ran a large number of
tests comparing the existing polynomial expression with a simpler
^2 variant. Though the polynomial was sometimes a little better at
the extremes of Q it was possible to get close for most clips and
even a little better on some.

This code also changes the way the RD multiplier is calculated
when the ZBIN is extended to use a variant of the same ^2
expression.

I hope that this simpler expression will be easier to tune further
as we expand our test set and consider adjustments based on content.

Change-Id: I73b2564346e74d1332c33e2c1964ae093437456c

1ca39bf2

Improve the accuracy of forward walsh-hadamard transform · b62d093e

Yaowu Xu authored 14 years ago

Besides slightly improving the round trip error, this
also fixes a sign bias in the forward transform, so the
round trip errors are evenly distributed between +1s and
-1s. The old bias seemed to work well with the dc sign bias
in the old fdct, which no longer exists in the improved fdct.

Change-Id: I8635e7be16c69e69a8669eca5438550d23089cef

b62d093e

28 Jun, 2010 - 1 commit

Fixed buffer selection for UV in AltRef filtering · aa8fe0d2

Adrian Grange authored 14 years ago

Corrected setting of "which_buffer" for U & V cases to match that
used for Y, i.e. to refer to the temporally most recent frame of
those to be filtered.

Change-Id: Idf94b287ef47a05f060da3e61134a0b616adcb6b

aa8fe0d2

24 Jun, 2010 - 5 commits

Added first-pass sse2 version of Yaowu's new fdct. · f1a3b1e0

Scott LaVarnway authored 14 years ago

Change-Id: Ib479210067510162879c368428b92690591120b2

f1a3b1e0

Redo the forward 4x4 dct · d0dd01b8

Yaowu Xu authored 14 years ago

The new fdct lowers the round trip sum squared error for a
4x4 block to ~0.12, or ~0.008/pixel. For reference, the old
matrix multiply version has average round trip error 1.46
for a 4x4 block.

Thanks to "derf" for his suggestions and references.

Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79

d0dd01b8

vp8cx : bestsad declared and initialized incorrectly. · a5906668

Fritz Koenig authored 14 years ago

bestsad needs to be an int and set to INT_MAX because at the end
of the function it is compared to INT_MAX to determine if there
was a match in the function.

Change-Id: Ie80e88e4c4bb4a1ff9446079b794d14d5a219788

a5906668

vp8cx : bestsad declared and initialized incorrectly. · cecdd73d

Fritz Koenig authored 14 years ago

bestsad should be an int initialized to INT_MAX. The optimized
SAD function expects a signed value for bestsad to use for comparison
and early loop termination. When no match is made, which is
determined by a comparison of bestsad to INT_MAX, INT_MAX is returned.

cecdd73d
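
The sentinel pattern described in the two commits above can be sketched as follows. This is only an illustration of using a signed INT_MAX as the "no match" value; the function name, the threshold parameter and the array of precomputed SADs are hypothetical and do not reflect the real motion search code.

```c
#include <limits.h>

/* Hypothetical search: returns the smallest SAD below `limit`,
 * or INT_MAX when no candidate qualifies. */
static int find_best_sad(const int *sads, int count, int limit) {
    int bestsad = INT_MAX; /* signed sentinel, compared against at the end */
    int i;
    for (i = 0; i < count; i++) {
        if (sads[i] < limit && sads[i] < bestsad) bestsad = sads[i];
    }
    return bestsad; /* a caller treats INT_MAX as "no match found" */
}
```

Because every comparison is between signed ints, an unmatched search is reliably distinguishable from any real SAD value.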

Remove INLINE/FORCEINLINE · 5e344614

John Koleszar authored 14 years ago

These are mostly vestigial; it's up to the compiler to decide what
should be inlined, and this collided with certain Windows platform SDKs.

Change-Id: I80dd35de25eda7773156e355b5aef8f7e44e179b

5e344614

21 Jun, 2010 - 4 commits

Fix breakout thresh computation for golden & AltRef frames · a08df455

agrange authored 14 years ago

1. Unavailability of each reference frame type should be tested
independently,
2. Also, only the VP8_GOLD_FLAG needs to be tested before setting
golden frame specific thresholds, and only VP8_ALT_FLAG needs
testing before setting thresholds relevant to the AltRef frame.
(Raised by gbvalor, in response to Issue 47)

Change-Id: I6a06fc2a6592841d85422bc1661e33349bb6c3b8

a08df455

Changed unary operator from ! to ~ · daa5d0eb

agrange authored 14 years ago

The intent is to reset the appropriate bit in ref_frame_flags, not to
test a logic condition. The prior result would always have
been ref_frame_flags being set to 0.
(Issue reported by dgohman, issue 47)

Change-Id: I2c12502ed74c73cf38e98c9680e0249c29e16433

daa5d0eb
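
The difference between the two operators in the commit above is easy to show in isolation. The flag names and values below are illustrative placeholders rather than the library's definitions.

```c
/* Illustrative flag bits; the real definitions live in the VP8 headers. */
enum { LAST_FLAG = 1, GOLD_FLAG = 2, ALT_FLAG = 4 };

/* Buggy form: !flag is 0 for any nonzero flag, so every bit is cleared. */
static int clear_flag_buggy(int flags, int flag) {
    return flags & !flag;
}

/* Intended form: ~flag has every bit set except the one being cleared. */
static int clear_flag(int flags, int flag) {
    return flags & ~flag;
}
```

Starting from all three flags set, the buggy form yields 0 while the intended form drops only the requested bit.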

Moved DOUBLE_DIVIDE_CHECK to denominator (was on numerator) · d4b99b8e

agrange authored 14 years ago

The DOUBLE_DIVIDE_CHECK macro prevents division by 0,
so it must be applied to the denominator to work as intended.

Change-Id: Ie109242d52dbb9a2c4bc1e11890fa51b5f87ffc7

d4b99b8e
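
A small sketch of why the placement matters. The macro body here is an assumption, modeled on a guard that nudges its argument away from zero by one millionth; the two wrapper functions are hypothetical.

```c
/* Assumed definition: push the value away from zero by a tiny epsilon. */
#define DOUBLE_DIVIDE_CHECK(x) ((x) < 0 ? (x) - 0.000001 : (x) + 0.000001)

/* Guard on the denominator: the division is always well defined. */
static double guarded_ratio(double num, double den) {
    return num / DOUBLE_DIVIDE_CHECK(den);
}

/* Guard on the numerator: a zero denominator is still a zero denominator,
 * so this form offers no protection. */
static double misplaced_ratio(double num, double den) {
    return DOUBLE_DIVIDE_CHECK(num) / den;
}
```

With a zero denominator, `guarded_ratio(3.0, 0.0)` evaluates to a large finite value (about three million), whereas `misplaced_ratio` still divides by zero.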

Fix a linker error on x86-64 Linux when not using a version script. · 9f814634

Timothy B. Terriberry authored 14 years ago

If the version script produced by the libvpx build system is not
 used when linking a shared library on x86-64 Linux, the constant
 data in the subpel filters produces R_X86_64_32 relocation errors
 due to the use of wrt rip addressing instead of
 wrt rip wrt ..gotpcrel.
Instead of adding a new macro for this addressing mode, this patch
 sets the ELF visibility of these symbols to "hidden", which
 allows wrt rip addressing to work without a text relocation.
This allows building a shared library without using the provided
 build system or a separate version script.
Fixes http://code.google.com/p/webm/issues/detail?id=46

Change-Id: Ie108f9d9a4352e5af46938bf4750d2302c1b2dc2

9f814634

18 Jun, 2010 - 2 commits

vp8_block_error_xmm: remove unnecessary instructions · 220daa00

Jim Bankoski authored 14 years ago

Remove a couple instructions from this function which weren't
necessary for correct execution.

Change-Id: Ib649674f140689f7e5c1530c35686241688a3151

220daa00

cosmetics: trim trailing whitespace · 94c52e4d

John Koleszar authored 14 years ago

When the license headers were updated, they accidentally contained
trailing whitespace, so unfortunately we have to touch all the files
again.

Change-Id: I236c05fade06589e417179c0444cb39b09e4200d

94c52e4d

16 Jun, 2010 - 1 commit

Change bitreader to use a larger window. · c17b62e1

Timothy B. Terriberry authored 14 years ago

Change bitreading functions to use a larger window which is refilled less
 often.

This makes it cheap enough to do bounds checking each time the window is
 refilled, which avoids the need to copy the input into a large circular
 buffer.
This uses less memory and speeds up the total decode time by 1.6% on an ARM11,
 2.8% on a Cortex A8, and 2.2% on x86-32, but less than 1% on x86-64.

Inlining vp8dx_bool_decoder_fill() has a big penalty on x86-32, as does moving
 the refill loop to the front of vp8dx_decode_bool().
However, having the refill loop between computation of the split values and
 the branch in vp8_decode_mb_tokens() is a big win on ARM (presumably due to
 memory latency and code size: refilling after normalization duplicates the
 code in the DECODE_AND_BRANCH_IF_ZERO and DECODE_AND_LOOP_IF_ZERO cases).
Unfortunately, refilling at the end of vp8dx_bool_decoder_fill() and at the
 beginning of each decode step in vp8_decode_mb_tokens() means the latter
 requires an extra refill at the end.
Platform-specific versions could avoid the problem, but would require most of
 detokenize.c to be duplicated.

Change-Id: I16c782a63376f2a15b78f8086d899b987204c1c7

c17b62e1
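
The windowed refill idea from the commit above can be sketched with a plain MSB-first bit reader. This is deliberately not the VP8 boolean decoder: the struct, the function names and the 64-bit window size are all assumptions chosen to show how a wide window makes a bounds check on every refill affordable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *next; /* next unread input byte */
    const uint8_t *end;  /* one past the last input byte */
    uint64_t window;     /* unread bits, left-aligned */
    int count;           /* number of valid bits in the window */
} bit_reader;

static void br_init(bit_reader *br, const uint8_t *buf, size_t size) {
    br->next = buf;
    br->end = buf + size;
    br->window = 0;
    br->count = 0;
}

/* Refill whole bytes while there is room, checking bounds on each byte.
 * No copy of the input into a separate buffer is needed. */
static void br_fill(bit_reader *br) {
    while (br->count <= 56 && br->next < br->end) {
        br->window |= (uint64_t)*br->next++ << (56 - br->count);
        br->count += 8;
    }
}

/* Read n bits (1..32); bits past the end of the input read as zero. */
static uint32_t br_read(bit_reader *br, int n) {
    uint32_t value;
    assert(n >= 1 && n <= 32);
    if (br->count < n) br_fill(br); /* infrequent: the window is wide */
    value = (uint32_t)(br->window >> (64 - n));
    br->window <<= n;
    br->count = br->count > n ? br->count - n : 0;
    return value;
}
```

Since one refill can load up to eight bytes, most reads touch only the window, and the bounds check is paid once per refill rather than once per bit.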

15 Jun, 2010 - 1 commit

More on "some XMM registers are non-volatile on windows x64 ABI" · 397aad3e

Yunqing Wang authored 14 years ago

Add same fix in subpixel_sse2.asm.

Change-Id: Icfda6103cbf74ec43308e96961dd738aa823c14d

397aad3e

14 Jun, 2010 - 3 commits

vp8_cx_iface: set default cpu used to 0 · 89c8b3db

John Koleszar authored 14 years ago

Change-Id: I7b35f4717cdd204224112f72471b551617262417

89c8b3db

Fix compiler warnings · 5a72620d

Guillermo Ballester Valor authored 14 years ago

Change-Id: I2a97f08cc3c7808ce5be39e910cc5147ecf03a1d

5a72620d

sse2 version of vp8_regular_quantize_b · 48c84d13

Scott LaVarnway authored 14 years ago

Added sse2 version of vp8_regular_quantize_b which improved encode
performance (for the clip used) by ~10% for 32 bit builds and ~3% for
64 bit builds.

Also updated SHADOW_ARGS_TO_STACK to allow for more than 9 arguments.

Change-Id: I62f78eabc8040b39f3ffdf21be175811e96b39af

48c84d13

12 Jun, 2010 - 1 commit

Make this/next iiratio unsigned. · cd475da8

John Koleszar authored 14 years ago

This patch addresses issue #79, which is a regression since commit
28de670c "Fix RD bug." If the coded error value is zero, the iiratio
calculation effectively becomes a multiplication by 1000000 due to the
DOUBLE_DIVIDE_CHECK macro. This can result in a value larger than
INT_MAX, giving a negative ratio. Since the error values are
conceptually unsigned (though they're stored in a double) this patch
makes the iiratio values unsigned, which allows the clamping to work
as expected.

cd475da8
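
The range problem in the commit above can be sketched as follows. The macro body is an assumption consistent with the "multiplies by 1000000" description, and the helper below saturates in the floating-point domain before converting, which is one possible way to keep the value in range and not necessarily how the patch does it.

```c
#include <stdint.h>

/* Assumed definition: push the value away from zero by a tiny epsilon. */
#define DOUBLE_DIVIDE_CHECK(x) ((x) < 0 ? (x) - 0.000001 : (x) + 0.000001)

/* Hypothetical helper: intra/coded error ratio as an unsigned value.
 * Saturating before the conversion keeps the cast well defined. */
static uint32_t ratio_as_u32(double intra_error, double coded_error) {
    double ratio = intra_error / DOUBLE_DIVIDE_CHECK(coded_error);
    if (!(ratio > 0.0)) return 0; /* also catches NaN */
    if (ratio >= (double)UINT32_MAX) return UINT32_MAX;
    return (uint32_t)ratio;
}
```

With a coded error of zero, an intra error of 3000 already gives a ratio near three billion, which exceeds INT_MAX but still fits in an unsigned 32-bit value.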

11 Jun, 2010 - 5 commits

Enable vp8_sad16x16x4d_sse3 in non-RTCD case · 59c50966

John Koleszar authored 14 years ago

Typo caused C version of 16x16x4 SAD to be called when built with
--disable-runtime-cpu-detect.

Change-Id: I0fe6fa67280b3a5f13acb3c8ed914f039aaaf316

59c50966

require --enable-psnr to build ssim · 9099fc0d

John Koleszar authored 14 years ago

ssim.c compiles in a huge (512M) amount of global scratch space. Allocating
this data on the heap would be a better solution, but this file doesn't
need to be built at all in most cases, so as a first pass, disable it
except when doing opsnr.stt output (--enable-psnr).

Change-Id: I320d812f6d652a12516a16b52295ebff20b5bd42

9099fc0d

some XMM registers are non-volatile on windows x64 ABI · 63ea8705

Makoto Kato authored 14 years ago

XMM6 to XMM15 are non-volatile on Windows x64 ABI.  We have to save
these registers.

Change-Id: I4676309f1350af25c8a35f0c81b1f0499ab99076

63ea8705

Incorrect comment. · 20f7332b

Paul Wilkins authored 14 years ago

(Thanks to Ronald S. Bultje)

20f7332b

Use local pointer to pbi->common. · 7a81b29d

Paul Wilkins authored 14 years ago

7a81b29d