Commits · 65234504b95b4bc9a155539c16e457223d2e6c25 · BC / public / external / libvpx

05 Aug, 2014 - 1 commit

Directly split the block in partition search · 74593c1e

Pengchong Jin authored 10 years ago

This patch allows the encoder to split the block directly
in partition search, thereby skipping the search of NONE. It
computes a score that measures whether the 16x16 motion vectors
from the first pass in the current block are consistent with
each other. If they are inconsistent and we have enough Q
to encode, split the block directly and skip searching NONE.
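
A minimal sketch of the decision described above, in C. The commit message does not give the actual score, so this assumes one possible measure: the number of 16x16 blocks whose direction differs from the most common direction. The direction labels, `mv_inconsistency_score`, `should_split_directly`, `score_thresh`, and `q_ok` are all hypothetical names, not libvpx identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical direction labels for a 16x16 first-pass motion vector. */
enum { MV_ZERO = 0, MV_LEFT, MV_RIGHT, MV_UP, MV_DOWN, MV_DIR_COUNT };

/* Assumed score: count of blocks that disagree with the dominant direction.
 * 0 means every 16x16 block moves the same way. */
static int mv_inconsistency_score(const unsigned char *dirs, size_t n) {
  int hist[MV_DIR_COUNT] = { 0 };
  int best = 0;
  size_t i;
  for (i = 0; i < n; ++i) {
    assert(dirs[i] < MV_DIR_COUNT);
    ++hist[dirs[i]];
  }
  for (i = 0; i < (size_t)MV_DIR_COUNT; ++i)
    if (hist[i] > best) best = hist[i];
  return (int)n - best;
}

/* Returns 1 when the block should be split directly, skipping NONE.
 * q_ok stands in for the "enough Q to encode" condition. */
static int should_split_directly(const unsigned char *dirs, size_t n,
                                 int score_thresh, int q_ok) {
  return q_ok && mv_inconsistency_score(dirs, n) > score_thresh;
}
```

With four 16x16 blocks that all point in different directions the assumed score is 3, so any threshold below 3 would trigger the direct split as long as the Q condition holds.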

This feature is under the flag CONFIG_FP_MB_STATS. In speed 2,
it gives a further speedup of 3-8% on sample yt clips
compared to the previous version under the same flag. Overall,
the features under the flag give a 7-15% speedup on typical yt
clips at data rates up to 6000kbps. The speedup at very high
data rates is not significant.

For hard stdhd clips:
park_joy_1080p @ 15000kbps:       504541ms -> 506293ms (-0.35%)
pedestrian_area_1080p @ 2000kbps: 326610ms -> 290090ms (+11.2%)

The compression performance using the features under the flag:
derf: -0.068%
yt:   -0.189%
hd:   -0.318%
stdhd:-0.183%

To use the feature, set CONFIG_FP_MB_STATS and turn on
cpi->use_fp_mb_stats.

Change-Id: Iad58a2966515c8861aa9eb211565b1864048d47f

74593c1e

04 Aug, 2014 - 2 commits

Store first pass motion vector directions · 233e0ccc

Pengchong Jin authored 10 years ago

Re-organize the one-byte structure for 16x16 first pass
block. Add bits to indicate motion vector directions.
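
As an illustration of packing direction bits into a one-byte record, here is a small C sketch. The bit positions and the `FP_FLAG_*` names are assumptions made for this example; the commit message does not spell out the actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments inside the one-byte per-block record. */
#define FP_FLAG_MOTION_ZERO  0x01u
#define FP_FLAG_MOTION_LEFT  0x02u
#define FP_FLAG_MOTION_RIGHT 0x04u
#define FP_FLAG_MOTION_UP    0x08u
#define FP_FLAG_MOTION_DOWN  0x10u

/* Encode the sign of each motion vector component as direction bits.
 * Negative column means left, negative row means up. */
static uint8_t pack_mv_direction(int mv_row, int mv_col) {
  uint8_t stats = 0;
  if (mv_row == 0 && mv_col == 0) return FP_FLAG_MOTION_ZERO;
  if (mv_col < 0) stats |= FP_FLAG_MOTION_LEFT;
  if (mv_col > 0) stats |= FP_FLAG_MOTION_RIGHT;
  if (mv_row < 0) stats |= FP_FLAG_MOTION_UP;
  if (mv_row > 0) stats |= FP_FLAG_MOTION_DOWN;
  return stats;
}

/* Test a single flag bit; flag must have exactly one bit set. */
static int has_flag(uint8_t stats, uint8_t flag) {
  assert(flag != 0 && (flag & (flag - 1)) == 0);
  return (stats & flag) != 0;
}
```

A diagonal vector sets two bits at once, which is why the sketch uses independent flags rather than a single enumerated direction.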

Change-Id: Id10754ba343dfc712c7fed5bcc85c67fa0bbcb89

233e0ccc

break at the end of clauses with assert(0) to avoid gcc warning · 7f63dabf

Jim Bankoski authored 10 years ago

Change-Id: I1b3c5337f018dde27dc819ab18bd081d169a91e8

7f63dabf

31 Jul, 2014 - 1 commit

Skip calling vp9_block_energy when aq-mode is off · 1c3a80b9

Jingning Han authored 10 years ago

The mb_energy value is used by aq-mode. Turn off computing its
value when aq-mode is off.

Change-Id: I26c239f124eca45a5ee58b90d19eae00d9a7cda5

1c3a80b9

30 Jul, 2014 - 3 commits

Early termination after partition NONE is done in RD. · 49866baa

Pengchong Jin authored 10 years ago

This patch allows the encoder to skip the search for partition
SPLIT, HORZ, VERT after the search for partition NONE is done
in RD optimization. It uses the first pass block-wise statistics
to make the decision. If all 16x16 blocks in the current partition
have zero motion and small residuals in the first pass statistics,
and the partition has a small difference variance, further partition
search is skipped.
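
The three conditions above can be sketched in C as follows. `FpBlockStat`, its fields, and `var_thresh` are hypothetical; the commit message does not state the actual data layout or threshold values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-16x16-block summary of the first pass statistics. */
typedef struct {
  int zero_motion;   /* 1 if the first-pass motion vector was zero */
  int small_residue; /* 1 if the first-pass residual was small */
} FpBlockStat;

/* Returns 1 when SPLIT, HORZ and VERT can be skipped after NONE:
 * every 16x16 block is static with a small residual, and the
 * difference variance of the partition is below var_thresh. */
static int skip_further_partition_search(const FpBlockStat *stats, size_t n,
                                         unsigned int diff_var,
                                         unsigned int var_thresh) {
  size_t i;
  assert(stats != NULL || n == 0);
  for (i = 0; i < n; ++i) {
    if (!stats[i].zero_motion || !stats[i].small_residue) return 0;
  }
  return diff_var < var_thresh;
}
```

A single moving or high-residual 16x16 block is enough to keep the full search, which keeps the shortcut conservative.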

For speed 2 setting, experiments on general youtube clips show that
the speedup varies from 1% - 10%, 5% on average. On the performance
side in PSNR, derf 0.004%, yt -0.059%, hd -0.106%, stdhd 0.032%.

For hard stdhd clips:
park_joy_1080p, 502952 ms -> 503307 ms (-0.07%)
pedestrian_area_1080p, 227049 ms -> 220531 ms (+3%)

This feature is under the compilation flag CONFIG_FP_MB_STATS and
it is off in current setting.

Change-Id: I554537e9242178263b65ebe14a04f9c221b58bae

49866baa

Refactor rd_pick_parition interface · d82ff942

Jingning Han authored 10 years ago

Remove the variable that indicates the relative block index. This
is explicitly covered by the use of pc_tree.

Change-Id: Ib13142582fff926c85e375bde656aa050add8350

d82ff942

Chessboard pattern partition search · ca2dcb7f

Jingning Han authored 10 years ago

This commit enables a chessboard pattern constrained partition
search for 720p and above resolutions. The scheme applies a stricter
partition search to alternating blocks based on the partition range
of their above/left neighboring blocks, as well as that of the
collocated blocks in the previous frame. It is currently turned
on at the 16x16 block size level. The chessboard pattern is flipped
every coded frame.
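
A minimal C sketch of a per-frame flipping chessboard selection, assuming block coordinates are given in units of the block size. The function names here are illustrative and the parity rule is an assumption consistent with the description, not a copy of the encoder code.

```c
#include <assert.h>

/* The pattern index alternates between 0 and 1 on successive frames. */
static int chessboard_index(int frame_index) { return frame_index & 0x1; }

/* Returns 1 if the block at (block_row, block_col) falls on the
 * "constrained" squares of the chessboard for this frame. */
static int use_constrained_search(int block_row, int block_col,
                                  int frame_index) {
  assert(block_row >= 0 && block_col >= 0 && frame_index >= 0);
  return ((block_row + block_col + chessboard_index(frame_index)) & 0x1) != 0;
}
```

Because the index flips each frame, a block that gets the stricter search in one frame gets the regular search in the next, so no position is permanently constrained.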

The speed 3 runtime is reduced:
park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up)
pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up)

The compression performance is changed:
hd     -0.223%
stdhd  -0.295%

Change-Id: I2d4d123ae89f7171562f618febb4d81789575b19

ca2dcb7f

29 Jul, 2014 - 2 commits

Clean up max/min allowed block size in rd_pick_partition · 6646ea73

Jingning Han authored 10 years ago

This commit replaces the repeated retrieval of the max and min allowed
partition sizes from speed_feature with local variables max_size and
min_size.

Change-Id: Ib06f11f16615e4876e4dd5fb6a968c6bf5f7b216

6646ea73

Use frame index directly in get_chessboard_index · c36f78b0

Jingning Han authored 10 years ago

get_chessboard_index() used to take a pointer to the entire
VP9_COMMON struct to retrieve the chessboard pattern index. This
change makes it take the frame index directly.

Change-Id: I3cad9d209ea2e77a358085a04fe1ff0ddec5ba03

c36f78b0

25 Jul, 2014 - 1 commit

Fix rd_pick_partition search loop for 4x4 blocks · 84af0486

Jingning Han authored 10 years ago

The partition search for 4x4 blocks takes unnecessary steps to
reconstruct pixels and an extra partition type update. This commit
removes such operations. No visible compression/speed difference.
Thanks to Yue (yuec@) for finding this issue.

Change-Id: I3f83824aa3fd3717d63be0b280fa57258939a70a

84af0486

24 Jul, 2014 - 1 commit

s/CONFIG_DENOISING/CONFIG_VP9_TEMPORAL_DENOISING · 9d337d34

Tim Kopp authored 10 years ago

This should prevent confusion with the VP8 CONFIG_TEMPORAL_DENOISING and other
flags.

Change-Id: I1fe4e2977895b7966841d861ab74317ad875b6c8

9d337d34

22 Jul, 2014 - 1 commit

Fix get_frame_type function · caad1686

Adrian Grange authored 10 years ago

Fixed the function get_frame_type to return the correct
frame type for golden and last frames.

Change-Id: I8edddd9aa26cbe7a1de8ff211389410b22b1bd14

caad1686

21 Jul, 2014 - 2 commits

Remove unfinished VP9 alpha channel. · 5926e7c0

Alex Converse authored 10 years ago

Change-Id: Ic5d3a3a0dac10b49495771886a31e793bb78b5ca

5926e7c0

Add -DNDEBUG when config option debug is disabled · 765485ca

Yunqing Wang authored 10 years ago

For gcc, when libvpx config option debug is disabled, added the
flag -DNDEBUG to disable the assertions in libvpx for some speedup.

Change-Id: Ifcb7b9e8ef5cbe5d07a24407b53b9a2923f596ee

765485ca

17 Jul, 2014 - 1 commit

Fixed a bug of setting wrong first pass mb stats pointer · e358ab5f

Pengchong Jin authored 10 years ago

The bug sets the wrong pointer to the first pass mb stats
if the encoder does the re-coding in the second pass.

Change-Id: I8a11f45dd7dceb38de814adec24cecccae370d00

e358ab5f

15 Jul, 2014 - 1 commit

VP9 Denoiser denoises after mode/bsize search · 03819ed9

Tim Kopp authored 10 years ago

In vp8, statistics are collected about the different modes as they are searched.
In vp9 this process is more complicated due to the variable block size. Fields were
added to the PICK_MODE_CONTEXT struct to hold this information for each point in
the search. The information is then taken from the appropriate part of the tree
during denoising.

Change-Id: I89261ab77ad637821287ae157dfdf694702b8e77

03819ed9

11 Jul, 2014 - 1 commit

Code refactoring: use defined inline functions · 1b5e9871

Yunqing Wang authored 10 years ago

Changed to use the defined inline functions consistently throughout
the code.

Change-Id: I7644d24fa7a837378564a6e0790416d3725dd200

1b5e9871

07 Jul, 2014 - 1 commit

Remove an empty line · 3316918b

Jingning Han authored 10 years ago

Change-Id: Id6eedc502c86433df1456dd994aee6bc9a1359a2

3316918b

02 Jul, 2014 - 2 commits

Split vp9_rdopt into vp9_rdopt and vp9_rd. · 03c276ea

Alex Converse authored 10 years ago

vp9_rdopt is for making rd optimal mode decisions. vp9_rd is for all
other rd related routines. Anything used outside of making an rd optimal
decision belongs in rd.

Change-Id: I772a3073f7588bdf139f551fb9810b6864d8e64b

03c276ea

Re-design quantization process · 9ac2f663

Jingning Han authored 10 years ago

This commit re-designs the quantization process for transform
coefficient blocks of size 4x4 to 16x16. It improves compression
performance for speed 7 by 3.85%. The SSSE3 version for the
new quantization process is included.

The average runtime of the 8x8 block quantization is reduced
from 285 cycles -> 255 cycles, i.e., over 10% faster.

Change-Id: I61278aa02efc70599b962d3314671db5b0446a50

9ac2f663

01 Jul, 2014 - 1 commit

Fix visual studio build issue · 9ba1d60b

Yunqing Wang authored 10 years ago

Fixed the signed/unsigned mismatch.

Change-Id: Id83d603b8f1745b71f4cf695a0751e55518b1316

9ba1d60b

30 Jun, 2014 - 3 commits

change to not force interp_type as SWITCHABLE · 186bd4eb

Yaowu Xu authored 10 years ago

Encoder still uses SWITCHABLE as default via DEFAULT_INTERP_FILTER,
but does not override the default if it is not SWITCHABLE.

Change-Id: I3c0f6653bd228381a623a026c66599b0a87d01d5

186bd4eb

Remove unused set_mode_info function · 30ab3701

Jingning Han authored 10 years ago

When the frame is intra coded only, the encoder takes the RD
coding flow. Hence the function set_mode_info is not practically
in use. This commit removes it and the associated conditional
branches.

Change-Id: I1e42659ceb55b771ba712d1cdecacb446aa6460d

30ab3701

Decide the partitioning threshold from the variance histogram · 9d41313e

Yunqing Wang authored 10 years ago

Before encoding a frame, calculate and store each 16x16 block's
variance of source difference between last and current frame.
Find partitioning threshold T for the frame from its variance
histogram, and then use T to make partition decisions.
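
The commit message does not say how T is derived from the histogram. As one hedged illustration in C, the sketch below assumes T is the upper edge of the first histogram bin at which a given percentage of blocks has been covered. `NUM_BINS`, `BIN_WIDTH`, `percent`, and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BINS 8      /* hypothetical number of histogram bins */
#define BIN_WIDTH 64u   /* hypothetical variance range per bin */

/* Build a histogram of per-block variances and return the upper edge of
 * the first bin where the cumulative count reaches percent of all blocks.
 * Variances beyond the last bin are clamped into it. */
static unsigned int threshold_from_histogram(const unsigned int *vars,
                                             size_t n, unsigned int percent) {
  size_t hist[NUM_BINS] = { 0 };
  size_t i, cum = 0, target;
  assert(percent <= 100);
  for (i = 0; i < n; ++i) {
    size_t bin = vars[i] / BIN_WIDTH;
    if (bin >= NUM_BINS) bin = NUM_BINS - 1;
    ++hist[bin];
  }
  target = (n * percent + 99) / 100; /* ceil(n * percent / 100) */
  for (i = 0; i < NUM_BINS; ++i) {
    cum += hist[i];
    if (cum >= target) return (unsigned int)(i + 1) * BIN_WIDTH;
  }
  return NUM_BINS * BIN_WIDTH;
}
```

Blocks whose variance falls below the returned T could then be kept as larger partitions, while blocks above it would be partitioned further.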

Compared with fixed 16x16 partitioning, tests on the rtc set showed an
overall psnr gain of 3.242% and an ssim gain of 3.751%. The best
psnr gain is 8.653%.

The overall encoding speed didn't change much. It got faster for
some clips (for example, a 12% speedup for vidyo1), and a little
slower for others.

Also, a minor modification was made in datarate unit test.

Change-Id: Ie290743aa3814e83607b93831b667a2a49d0932c

9d41313e

29 Jun, 2014 - 1 commit

remove unused parms from rd_pick_inter_mode_sb_seg_skip · a13bf653

Jim Bankoski authored 10 years ago

Change-Id: I7f989d197444d166133ad91eb23ac1033109f58d

a13bf653

26 Jun, 2014 - 2 commits

Adaptive txfm size selection depending on residual sse/variance · 5a3e3c6d

Jingning Han authored 10 years ago

This commit enables an adaptive transform size selection method
for speed -6. It uses the largest transform size when the sse is more
than 4 times the variance, i.e., most energy is compacted in the
DC coefficient. Otherwise, it uses the default TX_8X8. It improves
the compression efficiency for the rtc set at speed -6 by 0.8%, with
no speed change observed.
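
The rule stated above is small enough to sketch directly in C. The `TxSize` enum and `pick_tx_size` are illustrative stand-ins, and `largest` is assumed to be the largest transform size allowed for the block.

```c
#include <assert.h>

/* Illustrative transform size enum, ordered from smallest to largest. */
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32 } TxSize;

/* Use the largest allowed transform when sse > 4 * var, i.e. when most
 * of the residual energy sits in the DC term; otherwise use TX_8X8. */
static TxSize pick_tx_size(unsigned int sse, unsigned int var,
                           TxSize largest) {
  assert(largest >= TX_4X4 && largest <= TX_32X32);
  /* Widen before multiplying so 4 * var cannot overflow. */
  if ((unsigned long long)sse > 4ULL * var) return largest;
  return TX_8X8;
}
```

The comparison is strict, so a block whose sse is exactly four times its variance still takes the TX_8X8 default.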

Change-Id: Ie6ed1e728ff7bf88ebe940a60811361cdd19969c

5a3e3c6d

Skip the partition search for the frame with no motion · 12861260

Pengchong Jin authored 10 years ago

This patch allows the encoder to skip the partition search for the
frame if it is an inter frame and only zero motion vectors have
been detected in the first pass. The partition size is directly
assigned according to the difference variance.
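
A hedged C sketch of this shortcut: first check that every first-pass motion vector is zero, then map the difference variance to a block size. The `BlockSize` enum, the function names, and the variance cut-offs (16 and 256) are assumptions for illustration; the commit message does not give the actual mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative block sizes; a smaller variance allows a larger block. */
typedef enum { BLK_16X16, BLK_32X32, BLK_64X64 } BlockSize;

/* Returns 1 if every first-pass motion vector in the frame was zero. */
static int only_zero_motion(const int *mv_is_zero, size_t n) {
  size_t i;
  assert(mv_is_zero != NULL || n == 0);
  for (i = 0; i < n; ++i) {
    if (!mv_is_zero[i]) return 0;
  }
  return 1;
}

/* Assign a partition size from the difference variance without any
 * search. The cut-offs are hypothetical. */
static BlockSize size_from_diff_variance(unsigned int diff_var) {
  if (diff_var < 16) return BLK_64X64;
  if (diff_var < 256) return BLK_32X32;
  return BLK_16X16;
}
```

For slideshow-like content most frames would pass the zero-motion check, which fits the large second-pass speedup reported for those clips.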

Borg tests show little overall performance change in terms of PSNR
(derf -0.027%, yt 0.152%, hd 0.078%, stdhd 0%). The worst case of
PSNR loss is -0.514% from yt. The best PSNR gain is 4.293% from yt.
The second pass encoding speedup for slideshow clips is 15%-40%.

Change-Id: I881f347d286553ee5594a9ea09ba1a61ac684045

12861260

24 Jun, 2014 - 2 commits

Reuse inter prediction result in real-time speed 6 · 0aae1000

Yunqing Wang authored 10 years ago

In real-time speed 6, no partition search is done. The inter
prediction results obtained while picking the mode can be reused in
the subsequent encoding process. A speed feature reuse_inter_pred_sby
is added to enable the reuse only in speed 6.

This patch doesn't change encoding result. RTC set tests showed
that the encoding speed gain is 2% - 5%.

Change-Id: I3884780f64ef95dd8be10562926542528713b92c

0aae1000

Fix some bugs in multi-arf · 8160a26f

Paul Wilkins authored 10 years ago

Fix some bugs relating to the use of buffers
in the overlay frames.

Fix bug where a mid sequence overlay was
propagating large partition and transform sizes into
the subsequent frame because of :-
  sf->last_partitioning_redo_frequency  > 1 and
  sf->tx_size_search_method == USE_LARGESTALL

Change-Id: Ibf9ef39a5a5150f8cbdd2c9275abb0316c67873a

8160a26f

20 Jun, 2014 - 3 commits

Switch active map implementation to segment based. · aeacaac5

Alex Converse authored 10 years ago

Change-Id: Ibb841a1fa4d08d164cf5461246ec290f582b1f80

aeacaac5

Fork vp9_rd_pick_inter_mode_sb_seg_skip · e8a4edf4

Alex Converse authored 10 years ago

Change-Id: I549868725b789f0f4f89828005a65972c20df888

e8a4edf4

Actually skip blocks in skip segments in non-rd encoder. · 173a86b2

Alex Converse authored 10 years ago

Copy split from macroblock to pick mode context so it doesn't get lost.

Change-Id: Ie37aa12558dbe65c4f8076cf808250fffb7f27a8

173a86b2

12 Jun, 2014 - 3 commits

Replacing txfm_size with tx_size. · 4345d12d

Dmitry Kovalev authored 10 years ago

Change-Id: Ifa6374e9db5919322733b656e0865f5f19ee6f2c

4345d12d

Fast computation path for forward transform and quantization · ccba289f

Jingning Han authored 10 years ago

This commit enables a fast path computational flow for forward
transformation. It checks the sse and variance of prediction
residuals and decides if the quantized coefficients are all
zero, dc only, or more. It then selects the corresponding coding
path in the forward transformation and quantization stage.
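
One way to picture the three-way decision is the C sketch below. It assumes the variance approximates the AC energy of the residual and `sse - var` approximates the DC energy; the thresholds `dc_thresh` and `ac_thresh` and all names are hypothetical, since the commit message does not state the exact test.

```c
#include <assert.h>

/* The three coding paths named in the commit message. */
typedef enum { PATH_ALL_ZERO, PATH_DC_ONLY, PATH_FULL } CodingPath;

/* Classify a residual block from its sse and variance.
 * Low AC energy and low DC energy: all coefficients quantize to zero.
 * Low AC energy but noticeable DC energy: only the DC term survives.
 * Otherwise: run the full transform and quantization. */
static CodingPath pick_coding_path(unsigned int sse, unsigned int var,
                                   unsigned int dc_thresh,
                                   unsigned int ac_thresh) {
  unsigned int dc_energy;
  assert(sse >= var); /* sse - var must not underflow */
  dc_energy = sse - var;
  if (var < ac_thresh) {
    return dc_energy < dc_thresh ? PATH_ALL_ZERO : PATH_DC_ONLY;
  }
  return PATH_FULL;
}
```

In a real encoder the thresholds would presumably depend on the quantizer step, so that coarser quantization sends more blocks down the cheap paths.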

It is currently enabled in rtc coding mode. Will do it for rd
coding mode next.

In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps
goes down from 14234 ms to 13704 ms, i.e., about 4% speed-up.
Overall coding performance for rtc set is changed by -0.18%.

Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1

ccba289f

Fix SEG_LVL_SKIP in non-RD inter mode selection. · 6c3f311b

Alex Converse authored 10 years ago

Add a set_mode_info_seg_skip function that fills the requisite mode info.

Change-Id: I460b1b6845d720d9b09ed5b64df0ea0aac443f62

6c3f311b

09 Jun, 2014 - 1 commit

Use small transform size in non-rd real-time mode · b04d7668

Yunqing Wang authored 10 years ago

In non-rd real-time mode, choosing a smaller transform size in
encoding gives better video quality and a good speed gain compared
with choosing a larger transform size. This patch sets the tx size
search method to ALLOW_8X8, which is better than using 4x4 or other
larger sizes.

Borg tests on rtc set at speed 6 showed significant gain on quality.
PSNR gain: 11.034% and SSIM gain: 15.466%.

The speed gain is 5% - 12% for <720p clips, and 2% - 7% for
720p clips.

Change-Id: If4dc74ed2df359346b059f47fb73b4a0193ec548

b04d7668

06 Jun, 2014 - 2 commits

Removing chessboard_index from SPEED_FEATURES. · 923c30a1

Dmitry Kovalev authored 10 years ago

This is not a speed feature, adding inline function instead.

Change-Id: Ia48c41802eec9e92cf990339d724097279695c9a

923c30a1

Adding encode_tiles() function. · 31403fd7

Dmitry Kovalev authored 10 years ago

Change-Id: Ib8187c8f2556e1e9268b0683cd2b6ff3489f0205

31403fd7

05 Jun, 2014 - 2 commits

Removing unused tt_activity_measure(). · 580d72d3

Dmitry Kovalev authored 10 years ago

Change-Id: Ifcb46e6904730d14b9ef76b648b4d0dc3cd5d0c5

580d72d3

Removing unused motion_vector_context enum from vp9_encodeframe.c · 85677393

Dmitry Kovalev authored 10 years ago

The same enum defined and used in vp9_mvref_common.c.

Change-Id: I3975103997797add0a258d36c96d20ac9561a73d

85677393