Commits · 5b0de48dddf5eb6631ac2f18b654fcd3226ba971 · BC / public / external / libvpx

25 Jul, 2011 - 3 commits
- Merge "Use CONFIG_FAST_UNALIGNED consistently in codec" · 5b0de48d
  Yunqing Wang authored 13 years ago
  
- Specify size for argument pushed to stack · fe270dd5
  Yunqing Wang authored 13 years ago
```
This change fixes a build error on Win64.

Change-Id: I63d25b26220c4da8a98ca2e36530cbb802468e6b
```
- Use CONFIG_FAST_UNALIGNED consistently in codec · 65dfcf46
  Yunqing Wang authored 13 years ago
```
CONFIG_FAST_UNALIGNED is enabled by default. Disable it if it is
not supported by hardware.

Change-Id: I7d6905ed79fed918bca074bd62820b0c929d81ab
```
22 Jul, 2011 - 4 commits

Merge "fix sharpness bug and clean up" · 773bcc30
Johann authored 13 years ago


fix sharpness bug and clean up · a04ed0e8

Johann authored 13 years ago

sharpness was not recalculated in vp8cx_pick_filter_level_fast

remove last_filter_type. all values are calculated, don't need to update
the lfi data when it changes.

always use cm->sharpness_level. the extra indirection was annoying.

don't track last frame_type or sharpness_level manually. frame type
only matters for motion search and sharpness_level is taken care of in
frame_init

move function declarations to their proper header

Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db


Merge "Preload reference area to an intermediate buffer in sub-pixel motion search" · 829179e8
Yunqing Wang authored 13 years ago


Preload reference area to an intermediate buffer in sub-pixel motion search · 20bd1446

Yunqing Wang authored 13 years ago

In sub-pixel motion search, the search range is small (+/- 3 pixels).
Preload the whole search area from the reference buffer into a 32-byte
aligned buffer, then load reference data from this buffer during the
search. This keeps the data in cache and reduces the penalty for
crossing cache lines. For the tulip clip, tests on an Intel Core2 Quad
machine (Linux) showed these encoder speed improvements:
  3.4%   at --rt --cpu-used=-4
  2.8%   at --rt --cpu-used=-3
  2.3%   at --rt --cpu-used=-2
  2.2%   at --rt --cpu-used=-1

A test on an Atom notebook showed only a 1.1% speed improvement (speed=-4).
A test on a Xeon machine also showed less improvement, since unaligned
data access latency is greatly reduced in newer cores.

Next, I will apply a similar idea to the other 2 sub-pixel search functions
for encoding speed > 4.

Make this change exclusively for x86 platforms.

Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f
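
The preload step described in this message can be illustrated with a minimal C sketch. This is not the code from the commit: the function name `preload_search_area`, the 16x16 block size, and the 32-byte scratch stride are assumptions chosen for illustration, and only the +/- 3 pixel range and the 32-byte alignment come from the message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes: a 16x16 block with a 3-pixel margin on every side. */
enum { BLOCK = 16, MARGIN = 3, SPAN = BLOCK + 2 * MARGIN, BUF_STRIDE = 32 };

/* Copy the SPAN x SPAN window around the block at `ref` into `buf`, which
 * must be 32-byte aligned and hold SPAN * BUF_STRIDE bytes. Returns a
 * pointer to the block origin inside `buf`, so the search can index it
 * with offsets in [-MARGIN, BLOCK + MARGIN) using stride BUF_STRIDE. */
static const uint8_t *preload_search_area(const uint8_t *ref, int ref_stride,
                                          uint8_t *buf)
{
    /* Top-left corner of the search window in the reference frame. */
    const uint8_t *src = ref - MARGIN * ref_stride - MARGIN;
    int r;

    assert(((uintptr_t)buf & 31) == 0); /* caller supplies aligned storage */

    for (r = 0; r < SPAN; ++r)
        memcpy(buf + r * BUF_STRIDE, src + r * ref_stride, SPAN);

    return buf + MARGIN * BUF_STRIDE + MARGIN;
}
```

In this sketch every row of the scratch buffer starts on a 32-byte boundary, so the 22 bytes copied per row never straddle a 64-byte cache line, whereas rows read directly from the frame can.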


21 Jul, 2011 - 6 commits
- Merge "Add .size directive to ARM asm functions." · 52d13777
  Johann authored 13 years ago
  
- Merge "Mark ARM asm objects as allowing a non-executable stack." · ddcdbfd7
  Johann authored 13 years ago
  
- Add .size directive to ARM asm functions. · 1647f00c
  Timothy B. Terriberry authored 13 years ago
```
This makes them show up properly in debugging tools like gdb and
 valgrind.

Change-Id: I0c72548a1090de88ba226314e5efe63360b7e07f
```
- Mark ARM asm objects as allowing a non-executable stack. · 0453aca5
  Timothy B. Terriberry authored 13 years ago
```
This adds the magic .note.GNU-stack section at the end of each ARM
 asm file (when built with gas), indicating that a non-executable
 stack is allowed.
Without this section, the linker will assume the object requires an
 executable stack by default, forcing an executable stack for the
 entire program.

Change-Id: Ie86de6a449b52d392b9e5e0479833ed8c508ee65
```
- Merge "Increase chrow row alignment to 16 bytes." · 2bdda84e
  John Koleszar authored 13 years ago
  
- Merge "Add improvements made in good-quality mode to real-time mode" · c5fe6411
  Yunqing Wang authored 13 years ago
  
20 Jul, 2011 - 2 commits

Increase chrow row alignment to 16 bytes. · 7d1b37cd

Timothy B. Terriberry authored 13 years ago

This is done by expanding luma row to 32-byte alignment, since
 there is currently a bunch of code that assumes that
 uv_stride == y_stride/2 (see, for example, vp8/common/postproc.c,
 common/reconinter.c, common/arm/neon/recon16x16mb_neon.asm,
 encoder/temporal_filter.c, and possibly others; I haven't done a
 full audit).
It also replaces the hardcoded border of 16 in a number of
 encoder buffers with VP8BORDERINPIXELS (currently 32), as the
 chroma rows start at an offset of border/2.
Together, these two changes have the nice advantage that simply
 dumping the frame memory as a contiguous blob produces a valid,
 if padded, image.

Change-Id: Iaf5ea722ae5c82d5daa50f6e2dade9de753f1003
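
The alignment argument in this message can be checked with a short C sketch. The struct and function names here are hypothetical and this is not the allocator from the commit. It only assumes what the message states: the luma stride is rounded up to 32 bytes, the chroma stride is half the luma stride, and chroma rows start at an offset of border/2.

```c
#include <assert.h>

/* Hypothetical layout record, for illustration only. */
typedef struct {
    int y_stride;  /* bytes per luma row, including both borders */
    int uv_stride; /* bytes per chroma row */
    int uv_border; /* offset of the first chroma pixel within a row */
} plane_layout;

/* Round x up to a multiple of align, where align is a power of two. */
static int align_up(int x, int align)
{
    return (x + align - 1) & ~(align - 1);
}

static plane_layout compute_layout(int width, int border)
{
    plane_layout l;

    l.y_stride = align_up(width + 2 * border, 32);
    l.uv_stride = l.y_stride / 2; /* a multiple of 32 halves to a multiple of 16 */
    l.uv_border = border / 2;     /* 16 when the border is 32 */

    assert(l.uv_stride % 16 == 0);
    return l;
}
```

With a border of 32, both the chroma stride and the chroma border are multiples of 16, so every chroma row starts on a 16-byte boundary whenever the buffer itself does.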


encoder: don't set the fragment bit for the last partition · 0afcc769
Attila Nagy authored 13 years ago
```
Change-Id: Icb4e4f0d7c3074a8507852178be87541a1cb5bac
```

19 Jul, 2011 - 4 commits

Merge "Moved vp8_encode_bool into boolhuff.h" · b2d9700f
Scott LaVarnway authored 13 years ago


Revert "Disable __longjmp_chk protection" · d98a5ed4

John Koleszar authored 13 years ago

This reverts commit b73a3693.

This version of the check doesn't work with generic-gnu, and figuring
out the correct symbol version at configure time is probably more work
than this is worth. May revisit in the future.

Change-Id: I6c75e88bd3bd82a4b21e09a25780fe53aacb7d70


remove old armv5 code · 6afafc31

Johann authored 13 years ago

armv5 dequantizer is not referenced

Change-Id: Id1cc617dcee35ebd6a406816ec6aaa26e8bbc8ad


Moved vp8_encode_bool into boolhuff.h · a25f6a9c

Scott LaVarnway authored 13 years ago

allowing the compiler to inline this function.  For real-time
encodes, this gave a boost of 1% to 2.5%, depending on the
speed setting.

Change-Id: I3929d176cca086b4261267b848419d5bcff21c02


18 Jul, 2011 - 4 commits

Improved 1-pass CBR rate control · b5ea2fbc

John Koleszar authored 13 years ago

This patch attempts to improve the handling of CBR streams with
respect to the short term buffering requirements. The "buffer level"
is changed to be an average over the rc buffer, rather than a long
running average. Overshoot is also tracked over the same interval
and the golden frame targets suppressed accordingly to correct for
overly aggressive boosting.

Testing shows that this is fairly consistently positive in one
metric or another -- some clips that show significant decreases
in quality have better buffering characteristics, others show
improvements in both.

Change-Id: I924c89aa9bdb210271f2e03311e63de3f1f8f920


Merge "Disable __longjmp_chk protection" · 74ad25a4
John Koleszar authored 13 years ago

Merge "Fixed rate histogram calculation" · da39e505
John Koleszar authored 13 years ago


Fixed rate histogram calculation · fd41cb84

Tero Rintaluoma authored 13 years ago

Using small values for --buf-sz= on the command line causes a
floating point exception due to division by zero.

Change-Id: Ibfe2d44db922993a78ebc9a4a1087d9625de48ae


15 Jul, 2011 - 3 commits
- Merge "Tokenize MB optimized" · e68894fa
  Scott LaVarnway authored 13 years ago
  
- Merge "Fix vpxenc encoding incorrect webm file header on big endian machines(Issue 331)" · f676171e
  Yunqing Wang authored 13 years ago
  
- Tokenize MB optimized · 4e82f015
  Tero Rintaluoma authored 13 years ago
```
Optimized C-code of the following functions:
 - vp8_tokenize_mb
 - tokenize1st_order_b
 - tokenize2nd_order_b
Gives ~1-5% speed-up for RT encoding on Cortex-A8/A9
depending on encoding parameters.

Change-Id: I6be86104a589a06dcbc9ed3318e8bf264ef4176c
```
14 Jul, 2011 - 2 commits

bug fix vpx_copy_and_extend_frame size issue · 6b6f367c

James Berry authored 13 years ago

vpx_copy_and_extend_frame could incorrectly
resize uv frames which could result in a crash.

Change-Id: Ie96f7078b1e328b3907a06eebeee44ca39a2e898


Remove unused speed features · 04dce631

John Koleszar authored 13 years ago

min_fs_radius, max_fs_radius, full_freq were set but never read.

Change-Id: I82657f4e7f2ba2acc3cbc3faa5ec0de5b9c6ec74


13 Jul, 2011 - 12 commits

Merge "Better allocate yuv buffers." · 4ab3175b
Fritz Koenig authored 13 years ago

Merge "Fix unnecessary casting of B_PREDICTION_MODE (issue 349)" · f1f28535
Yunqing Wang authored 13 years ago


Disable __longjmp_chk protection · b73a3693

John Koleszar authored 13 years ago

glibc implements some checking on longjmp() calls by replacing it with
an internal function __longjmp_chk(), when FORTIFY_SOURCE is defined.
This can be problematic when compiling the library under one version of
glibc and running it under another. Work around this issue for the one
symbol affected for now, before taking out the undef hammer.

Fixes http://code.google.com/p/webm/issues/detail?id=166

Change-Id: Ifc5e25cdec17915e394711f2185b3e9214572d10


Fix unnecessary casting of B_PREDICTION_MODE (issue 349) · 139577f9
Yunqing Wang authored 13 years ago
```
Minor fix.

Change-Id: Iaf93f6e47e882a33c479e57c7a0d0bf321e291c0
```

Add improvements made in good-quality mode to real-time mode · 0e9a6ed7

Yunqing Wang authored 13 years ago

Several improvements we made in good-quality mode can be added
into real-time mode to speed up encoding in speed 1, 2, and 3
with small quality loss. Tests using tulip clip showed:

--rt --cpu-used=-1
(before change)
PSNR: 38.028
time: 1m33.195s
(after change)
PSNR: 38.014
time: 1m20.851s

--rt --cpu-used=-2
(before change)
PSNR: 37.773
time: 0m57.650s
(after change)
PSNR: 37.759
time: 0m54.594s

--rt --cpu-used=-3
(before change)
PSNR: 37.392
time: 0m42.865s
(after change)
PSNR: 37.375
time: 0m41.949s

Change-Id: I76ab2a38d72bc5efc91f6fe20d332c472f6510c9
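
To put the timings above on a common scale, the relative reduction in encode time can be computed as below. This is a small illustrative helper, not part of the commit, and the function name is an assumption.

```c
/* Percentage reduction in wall-clock time, given seconds before and after. */
static double percent_time_saved(double before_s, double after_s)
{
    return 100.0 * (before_s - after_s) / before_s;
}
```

Applied to the figures in this message (1m33.195s = 93.195 s, and so on), this gives roughly 13.2% at --cpu-used=-1, 5.3% at --cpu-used=-2, and 2.1% at --cpu-used=-3, against PSNR drops of 0.014 to 0.017.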


Better allocate yuv buffers. · e9751d4b

Fritz Koenig authored 13 years ago

Previously, more memory than necessary was allocated for yuv buffers.
The over-allocation makes it harder to track down bugs that read
uninitialized data.

Change-Id: I510f7b298d3c647c869be6e5d51608becc63cce9


Merge "Reduce motion vector search on alt-ref frame." · 84c3cd79
Fritz Koenig authored 13 years ago

Merge "Remove rotting NDS_NITRO code." · 7f0b11c0
John Koleszar authored 13 years ago

Merge "update x86 asm for loopfilter" · 211694f6
Johann authored 13 years ago

Merge "Update armv6 loopfilter to new interface" · 8f910594
Johann authored 13 years ago

Merge "Update armv7 loopfilter to new interface" · 1a219c22
Johann authored 13 years ago

Merge "New loop filter interface" · d9b825cf
Johann authored 13 years ago
