- 01 Aug, 2013 - 5 commits
-
-
Dmitry Kovalev authored
Change-Id: I27471768980fc631916069f24bc7c482a5c9ca17
-
Dmitry Kovalev authored
Removing assign_and_clamp_mv function, making implementation of clamp_mv and clamp_mv2 more clear and consistent. Change-Id: Iecd08e1c1bf0379f8314ebe01811f8253f4ade58
-
Deb Mukherjee authored
Adds a function to compute source variance for various sb_types to be used for pruning mode and partition searches. [The existing activity measure function is currently specialized for only 16x16 MBs and needs to be updated]. Change-Id: I22a41e6f1430184201487326fdbebb9b47e6fc24
-
Dmitry Kovalev authored
Change-Id: Icd128ab58719e0b9066bdfa66a5d0d427a84d6df
-
Jingning Han authored
This commit removes redundant arguments passing in the function of rd_pick_reference_frame. This resolves the clang warnings about potential use of uninitialized values. Change-Id: Ic68f949a9f8fcd0a583786b0c75321104ea44739
-
- 31 Jul, 2013 - 4 commits
-
-
Dmitry Kovalev authored
Passing mi_row and mi_col parameters to functions explicitly. Removing unused xd argument from scale_mv function. Change-Id: Icb4c495ec72d26fb066c14470d3ae0b741fbf18a
-
Dmitry Kovalev authored
Using different variable names "allow_hp" and "use_hp" instead of "usehp". Change-Id: I0cd5996ddeb46bd754473b680a993c0aaf8eb879
-
Jingning Han authored
Refactor the frame buffer referencing in choose_partition and make it consistent with other places. This means to prevent potential issues when we extend reference frame buffer. Change-Id: I5ff33ed5f671e1f4cc7049622212769a9b4578d9
-
Dmitry Kovalev authored
Using inter-mode counts instead of inter-mode-tree branch counts inside FRAME_COUNTS structure. Change-Id: I60dde13af37d06146d7d15543311c1b5044e9e04
-
- 30 Jul, 2013 - 3 commits
-
-
Adrian Grange authored
Removed unnecessary code lines, replaced switch with an if, fixed spelling errors and formatting. Change-Id: Ie48aa4604aa0ed48362ca359d792fb21b2ec1dc6
-
Yaowu Xu authored
Change-Id: Ica23b66f6664e5a5b168499584f0afffbc54794f
-
Jingning Han authored
The tokenize_b function is only called when output flag is on. Hence removing the conditional branch on it therein. Change-Id: Ib709f47f23f39ca05a695faf86fa3377f11f2dd0
-
- 29 Jul, 2013 - 6 commits
-
-
Jingning Han authored
This commit optimizes the tokenization and detokenization operational flow for speed-up. It makes the coding process about 0.3% faster at speed 0. Change-Id: I28008df7482874e4b5f237f2d418ff82a249dd56
-
Jingning Han authored
This commit makes the encoder skip the redundant tokenization process in the rate-distortion optimization search loop, while updating the entropy contexts accordingly. It makes the speed 0 encoding process about 0.5% faster at no performance change. Change-Id: I34a4155a0b5332afeb45c93a51c7f35a294d685c
-
Jingning Han authored
This commit provides special handle on 16x16 inverse 2D-DCT, where only DC coefficient is quantized to be non-zero value. Change-Id: I7bf71be7fa13384fab453dc8742b5b50e77a277c
-
Dmitry Kovalev authored
Change-Id: I2a6a646570e2af66315e7c658d00d99f80c4b127
-
Dmitry Kovalev authored
Change-Id: I10bf06e3a3d5271221ae6a42a36074d01d493039
-
Dmitry Kovalev authored
Change-Id: I6aa4191935aa93461a07c41b59fdae1eb5f5f107
-
- 27 Jul, 2013 - 2 commits
-
-
Ronald S. Bultje authored
This allows us to increment the position at the band-level only as we go from one band to the next; more importantly, that allows us to use an add instead of multiply instruction, and omit the instruction altogether if the band doesn't change from one coef to the next, thus being slightly faster (probably more noticeable on systems where a multiply is expensive, like arm). Change-Id: I4343fe35b9f9a47fa00b217bdcbf5f91ff96c381
-
Jingning Han authored
This commit brought back the shortcut implementation of 8x8/16x16 inverse 2D-DCT. When the eob <= 10, it skips the inverse transform operations on row 4:7/4:15 in the first round. For bus_cif at 1000 kbps, this provides about 2% speed-up at speed 0. Change-Id: I453e2d72956467d75be4ad8c04b4482ab889d572
-
- 26 Jul, 2013 - 3 commits
-
-
Jingning Han authored
This commit enables a special handle for the 8x8 inverse 2D-DCT, where only DC coefficient is quantized to be non-zero. For bus_cif at 2000 kbps, it provides about 1% speed-up at speed 0. Change-Id: I2523222359eec26b144cf8fd4c63a4ad63b1b011
-
Paul Wilkins authored
Speed feature experiment to set an upper and lower partition size limit based on what has been seen in spatial neighbors. This seems to gives quite reasonable speed gains in local (10-15%) and when used with speed 0 the losses are small (0.25% derf, 0.35% stdhd). However, for now I am only enabling it on speed 1 as there may be clashes with the existing temporal partition selection in speed 2. Using a tighter min / max around the range derived from the neighbors increases speed further but at the cost of a bigger quality loss. However, I think this spatial method could be combined with data from either the last frame or a variance method (or both) to refine the range of minimum and maximum partition size. I.e. consider the min and max from spatial and temporal neighbors and the variance recommendation. Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f
-
Yunqing Wang authored
Used 3 * standard_deviation in internal threshold calculation instead of fit curve. This actually approached the algorithm better. For comparison, similar tests were done: The overall psnr loss is less than before. 1. derf set: when static-thresh = 1, psnr loss is 0.329%; when static-thresh = 500, psnr loss is 0.970%; 2. stdhd set: when static-thresh = 1, psnr loss is 0.922%; when static-thresh = 500, psnr loss is 1.307%; Similar speedup is achieved. For example, clip bitrate static-thresh psnr time akiyo(cif) 500 0 48.952 5.077s(50f) akiyo 500 500 48.866 4.169s(50f) parkjoy(1080p) 4000 0 30.388 78.20s(30f) parkjoy 4000 500 30.367 70.85s(30f) sunflower(1080p) 4000 0 44.402 74.55s(30f) sunflower 4000 500 44.414 68.69s(30f) Change-Id: Ic78833642ce1911dbbd1cb6c899a2d7e2dfcc1f3
-
- 25 Jul, 2013 - 7 commits
-
-
Yunqing Wang authored
This option exists in VP8, and it was rewritten in VP9 to support skipping on different partition levels. After prediction is done, we can check if the residuals in the partition block will be all quantized to 0. If this is true, the skip flag is set, and only prediction data are needed in reconstruction. Based on DCT's energy conservation property, the skipping check can be estimated in spatial domain. The prediction error is calculated and compared to a threshold. The threshold is determined by the dequant values, and also adjusted by partition sizes. To be precise, the DC and AC parts for Y, U, and V planes are checked to decide skipping or not. Test showed that 1. derf set: when static-thresh = 1, psnr loss is 0.666%; when static-thresh = 500, psnr loss is 1.162%; 2. stdhd set: when static-thresh = 1, psnr loss is 1.249%; when static-thresh = 500, psnr loss is 1.668%; For different clips, encoding speedup range is between several percentage and 20+% when static-thresh <= 500. For example, clip bitrate static-thresh psnr time akiyo(cif) 500 0 48.923 5.635s(50f) akiyo 500 500 48.863 4.402s(50f) parkjoy(1080p) 4000 0 30.380 77.54s(30f) parkjoy 4000 500 30.384 69.59s(30f) sunflower(1080p) 4000 0 44.461 85.2s(30f) sunflower 4000 500 44.418 78.1s(30f) Higher static-thresh values give larger speedup with larger quality loss. Change-Id: I857031ceb466ff314ab580ac5ec5d18542203c53
-
Dmitry Kovalev authored
Removing unused constants, macros, and function declarations. Using ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving #include from *.h to *.c. Merging for loops for motion vectors. Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
-
Dmitry Kovalev authored
Change-Id: Ia6144d77ebed66e0739b62e4d673e26a95aa9550
-
Adrian Grange authored
Simplified the code that extracts and uses the motion vectors for the 4 sub-partitions in rd_pick_partition. Change-Id: Iaf698ef7ee3aef9edd59015e1ae065dd359b17d9
-
Jingning Han authored
This commit makes the initialization of trellis coeff optimization a per-plane operation, thereby eliminating the redundant steps in encode_sby and encode_sbuv. It makes the encoder at speed 0 slightly faster. Change-Id: Iffe9faca6a109dafc0dd69dc7273cbdec19b17cd
-
Dmitry Kovalev authored
Moving code from vp9_adapt_mode_context to vp9_adapt_mode_probs. Change-Id: I60829c30b28968cd813551ef3a206dfb98d323c9
-
Yaowu Xu authored
The feature that uses small partition results as a measure to skip mode evaluation at larger partition requires the flags to be reset. The reset was missing in the code path that calls rd_use_partition(). Change-Id: Ia0a3a0aee1a862b6e2333d596808db7c48033d50
-
- 24 Jul, 2013 - 7 commits
-
-
Dmitry Kovalev authored
Change-Id: I61a8b0101eac3ee2e0621d56151b90c269fd4db4
-
Dmitry Kovalev authored
Adding plane type check condition because it was always used outside of get_tx_type_{4x4, 8x8, 16x16}. Change-Id: I02f0bbfee8063474865bd903eb25b54d26e07230
-
Adrian Grange authored
Although local copies of the mode member variables (mode, ref_frame) were made, they were not used in all places. Also, made a local copy of the second_ref_frame member. Change-Id: I84d8c822e5cb3d8a02fc3de8a4037ca3fea8bfad
-
Ronald S. Bultje authored
Prevents doing duplicate IDCTs; encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min4.0 to 1min3.5, i.e. 0.87% faster overall. Change-Id: I2df39e29ed9d5ea5e7d2704a34940ba622832ddd
-
Ronald S. Bultje authored
Encoding time of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min5.4 to 1min4.0, i.e. 2.2% faster overall. Change-Id: I8c32f2aff9a649ce7dd49d910dc5ba16b99c3bc6
-
Adrian Grange authored
Change-Id: Id4138293efeac4503b2e01ce7a6c150a5abeef77
-
Dmitry Kovalev authored
Counts are separate from frame context. We have several frame contexts but need only one copy of all counts. Change-Id: I5279b0321cb450bbea7049adaa9275306a7cef7d
-
- 23 Jul, 2013 - 3 commits
-
-
Jingning Han authored
The struct optimize_block_args is defined same as encode_b_args. Remove this redundant definition, and use encode_b_args consistently. Change-Id: I1703aeeb3bacf92e98a34f4355202712110173d9
-
Dmitry Kovalev authored
Change-Id: I78d16ee758e1fae0200b746f00031f6d9c6d6ce7
-
Adrian Grange authored
Several consecutive for loops executed over the same index range, so I rolled them into one. Change-Id: I5cfcc8c38c738478965768409cca9d09adf224e1
-