- 12 Jul, 2013 - 6 commits
-
-
Dmitry Kovalev authored
Also removing unused declarations from vp9_entropymode.h file. Change-Id: Ib9c5826db3584a32f6bb3297a76c522b99d83402
-
Yaowu Xu authored
Change-Id: I23a75c495ed7ea917d7f312bef0990e20a6b53d9
-
James Zern authored
lg2 -> log2 Change-Id: I0602ddff49e42c9c40c29c084d04b7592b9f8edf
-
Deb Mukherjee authored
Implements some of the helper functions more efficiently with lookups rather than branches. The modeling function is consolidated to reduce some computations. Also merges the two enums BLOCK_SIZE_TYPES and BlockSize into one because there is no need to keep them separate (even though the semantics are a little different). No bitstream or output change. About 0.5% speedup. Change-Id: I7d71a66e8031ddb340744dc493f22976052b8f9f
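To illustrate the "lookups rather than branches" idea, here is a minimal sketch that is not taken from the commit itself. The enum `DemoBlockSize`, its members, and both helper functions are hypothetical names invented for this example; the actual helpers and tables in the codebase may look different.

```c
/* Hypothetical block-size enum; names and ordering are illustrative only. */
typedef enum {
  DEMO_BLOCK_4X4,
  DEMO_BLOCK_8X8,
  DEMO_BLOCK_16X16,
  DEMO_BLOCK_32X32,
  DEMO_BLOCK_64X64,
  DEMO_BLOCK_SIZES
} DemoBlockSize;

/* Branch-based helper: the compiler may emit a chain of comparisons. */
static int demo_width_log2_branch(DemoBlockSize b) {
  switch (b) {
    case DEMO_BLOCK_4X4: return 2;
    case DEMO_BLOCK_8X8: return 3;
    case DEMO_BLOCK_16X16: return 4;
    case DEMO_BLOCK_32X32: return 5;
    case DEMO_BLOCK_64X64: return 6;
    default: return -1;
  }
}

/* Lookup-based helper: a single indexed load from a constant table.
 * The caller must pass a valid enum value (below DEMO_BLOCK_SIZES). */
static const unsigned char demo_width_log2_lut[DEMO_BLOCK_SIZES] = {
  2, 3, 4, 5, 6
};

static int demo_width_log2_lookup(DemoBlockSize b) {
  return demo_width_log2_lut[b];
}
```

Both helpers return the same value for every valid block size; the table version trades a few bytes of constant data for the removal of data-dependent branches.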
-
Dmitry Kovalev authored
Removing redundant function arguments and curly braces. Change-Id: I46e02561f33fe02e84a3b19756f03b9504bd6a1b
-
Ronald S. Bultje authored
Change-Id: I78a79fc51c2d7cc3c261f35b569155397f3dc0c4
-
- 11 Jul, 2013 - 5 commits
-
-
Dmitry Kovalev authored
Change-Id: Iccd4ab95ea51a6d57ed43947f2fd7ad92e8979cf
-
Dmitry Kovalev authored
Adding segmentation struct to vp9_seg_common.h. Struct members are from macroblockd and VP9Common structs. Moving segmentation related constants and enums to vp9_seg_common.h. Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
-
Jingning Han authored
The function encode_block is called only by inter-prediction modes, hence removing the transform type branching there. Change-Id: I34a3172e28ce2388835efd0f8781922211bff857
-
Paul Wilkins authored
With sf->auto_mv_step_size on, it is questionable whether sf->reduce_first_step_size is worthwhile. At speed 2 it was not having a big impact. Even at speed 2, sf->optimize_coefficients = 0 is not having a big speed impact, so for now I have moved it down into a higher speed setting. Change-Id: I8a54de76d486ad37aabce76474889da2768b14c1
-
Ronald S. Bultje authored
Change-Id: Ia942e56cf322821d42ba06178672791eeee2847e
-
- 10 Jul, 2013 - 11 commits
-
-
Dmitry Kovalev authored
Change-Id: I0543e72fa092eef3976b65e16bb597197c364873
-
Jingning Han authored
This commit fixes the misuse of the tx_type for the inverse transform in the intra4x4 rate-distortion optimization loop. It improves the overall coding performance. Change-Id: I7fe9953175b74890357dbcee33c138573766e980
-
Dmitry Kovalev authored
Change-Id: Ic5257fa8278e9b6297de230e4fd26a1e23ad2bb7
-
Jim Bankoski authored
Change-Id: I5dea4570cb05df27a522abf6e7b695998654284a
-
Jim Bankoski authored
Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136
-
Deb Mukherjee authored
Adds a speed feature to eliminate full-rd computation if the modeled rd, or the rd based on a different parameter in the same mode, is already a lot larger than the best rd so far. Specifically, only search the sharp and smooth filters if the modeled rd cost based on the regular filter is within a certain factor of the best rd cost so far. Also, skip full-rd computation of non-splitmv inter modes if the modeled rd cost based on pred error is not within the same factor of the best rd cost so far. Also adds some enhancements in the rd search for splitmv mode to speed things up by early breakouts. Negligible impact on performance. Results on derfraw300: psnr: -0.013% with the splitmv enhancements, -0.24% with the rd breakout feature on. Speedup: 6% with splitmv enhancements, 20% with residual breakout as well (tested on football sequence at 600 Kbps). Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc
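The breakout test described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the commit's code: the function name `demo_skip_full_rd`, the integer `factor` parameter, and the use of `INT64_MAX` as the "no best yet" value are all hypothetical.

```c
#include <stdint.h>

/* Returns 1 when the expensive full-rd computation can be skipped because
 * the cheap modeled rd cost is more than `factor` times the best rd cost
 * found so far. `factor` must be >= 1. Dividing the modeled cost (rather
 * than multiplying the best cost) avoids overflow when best_rd is still
 * at its initial INT64_MAX value. */
static int demo_skip_full_rd(int64_t modeled_rd, int64_t best_rd, int factor) {
  if (best_rd == INT64_MAX) return 0; /* nothing to compare against yet */
  return modeled_rd / factor > best_rd;
}
```

With `factor` set to 2, for example, a candidate whose modeled cost is 1000 would be skipped when the best cost so far is 400, but a candidate with modeled cost 700 would still get the full computation.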
-
Jingning Han authored
This commit enables 16x16 ADST/DCT forward hybrid transform using SSE2 operations. It reduces the runtime from 5433 cycles to 1621 cycles, at no compression performance loss. Change-Id: I75fd7f1984e9e28846af459f810ff0d6ae125230
-
Ronald S. Bultje authored
Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: I9b25e87974430cb942caa276410bb2eda815bd83
-
Yaowu Xu authored
Change-Id: I721ebdeef2b53ce3e5c3eba3f7462ae2103c95a8
-
Jim Bankoski authored
Removes SEG_ID, MBSKIP, SWITCHABLE_INTERP, INTRA_INTER, COMP_INTER_INTER, COMP_REF_P, SINGLE_REF_P1, SINGLE_REF_P2 and TX_SIZE. Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b
-
- 09 Jul, 2013 - 4 commits
-
-
Dmitry Kovalev authored
Change-Id: Ie44824ec25fd8fdb25d7c8124a9b28c26d802029
-
John Koleszar authored
The files are empty and unused. Change-Id: Ieb4242d14273efdf24149bda33f9591540bba06a
-
Ronald S. Bultje authored
Change-Id: I8130ec9b5371c65e885f245a5ac73840c23cb4a1
-
Ronald S. Bultje authored
This probably has a mildly negative impact on performance, but will (in future commits - or possibly merged with this one) allow SIMD implementations of individual intra prediction functions. We may perhaps want to consider having separate functions per txfm-size also (i.e. 4x4, 8x8, 16x16 and 32x32 intra prediction functions for each intra prediction mode), but I haven't played much with that yet. Change-Id: Ie739985eee0a3fcbb7aed29ee6910fdb653ea269
-
- 08 Jul, 2013 - 7 commits
-
-
Ronald S. Bultje authored
The resulting reconstruction is never used, thus it just wastes CPU cycles. Reduces encode time of first 50 frames of bus (speed 0) @ 1500kbps from 2min2.0 to 2min1.2, i.e. a 0.65% overall speedup. Change-Id: I74755ca3aadc21e2be220f486259060bd4088c45
-
Ronald S. Bultje authored
Changes cost_mv_ref() to do a lookup (LUT) in pre-calculated cost arrays instead. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min11.6 to 2min10.9, i.e. 0.5% faster overall. Change-Id: If186e92c34c201b29cbbc058785a15c9c09e433a
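The general pattern of replacing a per-call cost computation with a pre-calculated table might look like the sketch below. Everything here is hypothetical: `demo_cost_slow` is a made-up stand-in for the real cost computation, and the table dimensions do not reflect the actual number of contexts or modes.

```c
/* Illustrative dimensions only. */
enum { DEMO_CONTEXTS = 4, DEMO_MODES = 4 };

/* Stand-in for an expensive per-call cost computation (made-up formula). */
static int demo_cost_slow(int ctx, int mode) {
  return 16 * (mode + 1) + 4 * ctx;
}

static int demo_cost_table[DEMO_CONTEXTS][DEMO_MODES];

/* Fill the table once, e.g. whenever the underlying probabilities change. */
static void demo_init_cost_table(void) {
  int ctx, mode;
  for (ctx = 0; ctx < DEMO_CONTEXTS; ++ctx)
    for (mode = 0; mode < DEMO_MODES; ++mode)
      demo_cost_table[ctx][mode] = demo_cost_slow(ctx, mode);
}

/* Hot-path replacement: a plain table read. */
static int demo_cost_lut(int ctx, int mode) {
  return demo_cost_table[ctx][mode];
}
```

The table has to be refreshed whenever the inputs to the slow function change; the gain comes from the lookup being called far more often than the table is rebuilt.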
-
Ronald S. Bultje authored
First 50 frames of bus @ 1500kbps (speed 0) goes from 2min12.6 to 2min11.6, i.e. 0.75% overall speedup. Change-Id: I67054f8146e82a02b6457c51a1c8627a937e5e1e
-
Ronald S. Bultje authored
Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: Ibe8b08d159797504c5d0c5122de1b6da3b6595e0
-
Ronald S. Bultje authored
Overall, on all test sets, this gains about +0.2% on all metrics. City is a clip where this really hurts (-1.0% on all metrics), I'm not quite sure why yet. Maybe interesting to look into in the future. Change-Id: I6f0eecb20e72f0194633270d30bf00d76d9eae78
-
Dmitry Kovalev authored
Eliminating usage of mb-units, switching to mi-units. Adding ALIGN_POWER_OF_TWO macro. Change-Id: I2491c969f713207c062011878b57e4e531818607
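A power-of-two alignment macro of the kind named above is commonly written as in the sketch below. The `DEMO_` names are hypothetical, and both the macro body and the assumption that one mi-unit covers 8 pixels are illustrative; the commit's actual definition is not shown in this log.

```c
/* Round `value` up to the next multiple of 2^n (n small, value >= 0). */
#define DEMO_ALIGN_POWER_OF_TWO(value, n) \
  (((value) + ((1 << (n)) - 1)) & ~((1 << (n)) - 1))

/* Assumption for this example: one mi-unit spans 8 pixels (2^3). */
#define DEMO_MI_SIZE_LOG2 3

/* Number of mi-unit columns needed to cover a frame `width` pixels wide. */
static int demo_mi_cols(int width) {
  return DEMO_ALIGN_POWER_OF_TWO(width, DEMO_MI_SIZE_LOG2) >> DEMO_MI_SIZE_LOG2;
}
```

For instance, a width of 17 pixels aligns up to 24 and therefore needs 3 mi-unit columns under the 8-pixel assumption.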
-
Deb Mukherjee authored
Skips mode searches for intra and compound inter modes depending on the best mode so far and the reference frames. The various heuristics to be used are selected by bits from a flag. The previous direction-based intra mode search pruning is also absorbed in this framework. Specifically, the flags and their impact are: 1) FLAG_SKIP_INTRA_BESTINTER (skip intra mode search for oblique directional modes and TM_PRED if the best so far is an inter mode) derfraw300: -0.15%, 10% speedup; 2) FLAG_SKIP_INTRA_DIRMISMATCH (skip D27, D63, D117 and D153 mode search if the best so far is not one of the closest hor/vert/diagonal directions) derfraw300: -0.05%, about 9% speedup; 3) FLAG_SKIP_COMP_BESTINTRA (skip compound prediction mode search if the best so far is an intra mode) derfraw300: -0.06%, about 7-8% speedup; 4) FLAG_SKIP_COMP_REFMISMATCH (skip compound prediction search if the best single ref inter mode does not have the same ref as one of the two references being tested in the compound mode) derfraw300: -0.56%, about 10% speedup. Change-Id: I1a736cd29b36325489e7af9f32698d6394b2c495
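A rough sketch of how such flag bits could gate the mode search follows. The `DEMO_` prefixed names, the bit positions, the `DemoSearchState` struct, and both functions are assumptions made for illustration; only the semantics of the flags come from the description above, and the direction-mismatch flag is left out for brevity.

```c
/* Bit positions are assumptions; only the flag meanings follow the text. */
enum {
  DEMO_FLAG_SKIP_INTRA_BESTINTER  = 1 << 0,
  DEMO_FLAG_SKIP_COMP_BESTINTRA   = 1 << 1,
  DEMO_FLAG_SKIP_COMP_REFMISMATCH = 1 << 2
};

/* Hypothetical summary of the search so far. */
typedef struct {
  int best_is_inter;   /* 1 if the best mode so far is inter, 0 if intra */
  int best_single_ref; /* ref of best single-ref inter mode, -1 if none  */
} DemoSearchState;

/* Skip an oblique directional or TM_PRED intra mode when the best mode
 * so far is an inter mode. */
static int demo_skip_intra(unsigned flags, const DemoSearchState *s,
                           int mode_is_oblique_or_tm) {
  return (flags & DEMO_FLAG_SKIP_INTRA_BESTINTER) && s->best_is_inter &&
         mode_is_oblique_or_tm;
}

/* Skip a compound mode using references ref0/ref1 when the best mode so
 * far is intra, or when the best single reference matches neither one. */
static int demo_skip_compound(unsigned flags, const DemoSearchState *s,
                              int ref0, int ref1) {
  if ((flags & DEMO_FLAG_SKIP_COMP_BESTINTRA) && !s->best_is_inter) return 1;
  if ((flags & DEMO_FLAG_SKIP_COMP_REFMISMATCH) && s->best_single_ref >= 0 &&
      s->best_single_ref != ref0 && s->best_single_ref != ref1)
    return 1;
  return 0;
}
```

Because each heuristic is one bit, a speed setting can enable any combination by OR-ing flags together, and a flags value of 0 disables all pruning.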
-
- 04 Jul, 2013 - 1 commit
-
-
Dmitry Kovalev authored
Removing set_refs, adding set_ref function. Change-Id: I5635c478b106ae4e57d317f1c83d929644307e63
-
- 03 Jul, 2013 - 6 commits
-
-
Dmitry Kovalev authored
Change-Id: I221126f22ab9067348eb0efb8a73b15a8f49c3fd
-
Jingning Han authored
This commit allows the encoder to detect the cumulative rate-distortion cost per transformed block inside a partition. If the cumulative rd cost is already above the best rd value, it terminates the remaining operations and continues to the next prediction mode test. It reduces the runtime of bus at target bit-rate 2000 from 308 seconds to 266 seconds, i.e., about 13% speed-up at no performance penalty. Change-Id: I5f15a3d8955d97031d5653006027866a00654e7a
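The early-termination idea can be sketched as a simple accumulation loop. This is a minimal illustration, assuming per-block rd costs are already available in an array; the function name `demo_partition_rd`, its parameters, and the use of `INT64_MAX` as the "abandoned" return value are hypothetical. In an encoder the per-block cost would be computed inside the loop, which is where the saving comes from.

```c
#include <stddef.h>
#include <stdint.h>

/* Accumulates per-block rd costs for one prediction mode. Stops as soon as
 * the running total exceeds best_rd, since the mode can no longer win.
 * Returns the total cost, or INT64_MAX if the mode was abandoned early.
 * *blocks_done receives the number of blocks actually processed. */
static int64_t demo_partition_rd(const int64_t *block_rd, size_t n,
                                 int64_t best_rd, size_t *blocks_done) {
  int64_t total = 0;
  size_t i;
  for (i = 0; i < n; ++i) {
    total += block_rd[i]; /* stands in for the costly per-block work */
    if (total > best_rd) {
      *blocks_done = i + 1;
      return INT64_MAX;
    }
  }
  *blocks_done = n;
  return total;
}
```

A caller would compare the returned value against `best_rd` and update the best mode only when the full total was produced.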
-
Dmitry Kovalev authored
Change-Id: I65be6acc54c99688fd1f0c946cec3511514b8555
-
Dmitry Kovalev authored
Change-Id: I32276552b3ea6dc1dce8e298be114cfe1019b31c
-
Jingning Han authored
These serve as building blocks for SSE2 8x8 and 16x16 ADST/DCT hybrid transform coding. Change-Id: I4089a754c66e0c986f67d9b8ec4dfb9627ad430d
-
Paul Wilkins authored
When this is 0 (BLOCK_SIZE_AB4X4) we want to do the inter joint search for all sizes. Change-Id: Id40cd6fe7790e7e1165352b9cef5e12fa8c0bc88
-