- 02 Jul, 2013 - 18 commits
-
-
Yaowu Xu authored
This commit adds a speed feature where only square partitions are evaluated in partition picking. Enabling this feature in cpu-used 2 reduces encoding time by ~30%. Loss of compression: -0.9% on cif set, -1.23% on stdhd. Change-Id: Ia6fad11210f0b78365abb889f9245604513be5b9
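A minimal sketch of the square-only restriction described above. The enum, the flag name `square_only`, and the helper are hypothetical names chosen for illustration, not the identifiers used in the patch; the sketch assumes that "square" means the unsplit block and the four-way split.

```c
#include <stdbool.h>

/* Hypothetical partition types; the names are illustrative only. */
typedef enum {
  PARTITION_NONE,   /* keep the block whole (square) */
  PARTITION_HORZ,   /* two horizontal rectangles */
  PARTITION_VERT,   /* two vertical rectangles */
  PARTITION_SPLIT   /* four square sub-blocks */
} PartitionType;

/* Returns true when the partition search should evaluate `type`.
 * With `square_only` set, the rectangular shapes are skipped. */
static bool should_evaluate_partition(PartitionType type, bool square_only) {
  if (!square_only) return true;
  return type == PARTITION_NONE || type == PARTITION_SPLIT;
}
```

With the flag set, the search loop visits two partition types per block instead of four, which is where a time saving of this kind would come from.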
-
Deb Mukherjee authored
This speed feature will skip searching the directional intra prediction modes D63, D117, D27, D153 if the best intra mode so far is not one of the diagonal, horizontal or vertical directions closest to the respective directions being tested. In other words, this implements a sort of binary search in the angular domain. Speedup: about 9-10% Results: -0.05% only on derfraw300. Change-Id: I413584c41f2a3e8dabfbdeb40718c8fc4b1d63a2
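The skip rule can be sketched as a small predicate. The mode names and the function are hypothetical, and the pairing of each tested direction with its two closest neighbours (for example D63 with V and D45) is an assumption read off the angles, not something the message spells out.

```c
#include <stdbool.h>

/* Hypothetical intra mode names modeled on the directions in the message. */
typedef enum {
  DC_PRED, V_PRED, H_PRED, D45_PRED, D135_PRED,
  D117_PRED, D153_PRED, D27_PRED, D63_PRED, TM_PRED
} IntraMode;

/* Returns true when `mode` should be skipped because the best intra mode
 * found so far is not one of its two assumed angular neighbours.
 * Modes other than D63, D117, D27 and D153 are never skipped here. */
static bool skip_directional_mode(IntraMode mode, IntraMode best_so_far) {
  switch (mode) {
    case D117_PRED:
      return best_so_far != V_PRED && best_so_far != D135_PRED;
    case D63_PRED:
      return best_so_far != V_PRED && best_so_far != D45_PRED;
    case D27_PRED:
      return best_so_far != H_PRED && best_so_far != D45_PRED;
    case D153_PRED:
      return best_so_far != H_PRED && best_so_far != D135_PRED;
    default:
      return false;
  }
}
```

For this to work, the vertical, horizontal and diagonal modes have to be searched before the four intermediate directions, so that `best_so_far` already reflects the coarse angular search.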
-
Deb Mukherjee authored
-
Deb Mukherjee authored
(1) Refines the modeling function and uses that to add some speed features. Specifically, instead of using a flag use_largest_txfm as a speed feature, an enum tx_size_search_method is used, of which two of the types are USE_FULL_RD and USE_LARGESTALL. Two other new types are added: USE_LARGESTINTRA (use largest only for intra) and USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for inter).
(2) Another change is that the framework for deciding transform type is simplified to use a heuristic count based method rather than an rd based method using txfm_cache. In practice the new method is found to work just as well, with derf only -0.01 down. The new method is more compatible with the new framework where certain rd costs are based on full rd and certain others are based on modeled rd or are not computed. In this patch the existing rd based method is still kept for use in the USE_FULL_RD mode. In the other modes, the count based method is used. However, the recommendation is to remove the rd based method eventually, since its benefit is limited and removing it would eliminate a lot of complications in the code.
(3) Finally, a bug is fixed with the existing use_largest_txfm speed feature that causes mismatches when the lossless mode and 4x4 WH transform is forced.
Results on derf:
USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction
USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a pretty good compromise)
USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction (currently the benefit of modeling is limited for txfm size selection, but keeping this enum as a placeholder)
USE_LARGESTALL: -1.05%, 27% encode time reduction (same as existing use_largest_txfm speed feature)
Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936
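The four search methods can be illustrated with a sketch like the following. The enumerator names come from the message; their order, the `TxSizeDecision` type, and the helper are hypothetical. That USE_LARGESTINTRA falls back to a full rd search for inter blocks is an assumption, since the message only says "largest only for intra".

```c
#include <stdbool.h>

/* Enumerator names from the commit message; the ordering is an assumption. */
typedef enum {
  USE_FULL_RD,
  USE_LARGESTINTRA,
  USE_LARGESTINTRA_MODELINTER,
  USE_LARGESTALL
} TxSizeSearchMethod;

/* Hypothetical result type: how the transform size is chosen for a block. */
typedef enum {
  TX_PICK_FULL_RD,  /* try every transform size with full rd */
  TX_PICK_LARGEST,  /* take the largest allowed transform size */
  TX_PICK_MODEL     /* choose using the modeled rd cost */
} TxSizeDecision;

static TxSizeDecision tx_size_decision(TxSizeSearchMethod method,
                                       bool is_inter) {
  switch (method) {
    case USE_LARGESTALL:
      return TX_PICK_LARGEST;
    case USE_LARGESTINTRA:
      /* Assumed: inter blocks keep the full rd search. */
      return is_inter ? TX_PICK_FULL_RD : TX_PICK_LARGEST;
    case USE_LARGESTINTRA_MODELINTER:
      return is_inter ? TX_PICK_MODEL : TX_PICK_LARGEST;
    case USE_FULL_RD:
    default:
      return TX_PICK_FULL_RD;
  }
}
```

Replacing the boolean flag with an enum leaves room for intermediate trade-offs between the two extremes the old flag could express.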
-
Deb Mukherjee authored
Uses mapping tables instead of complicated modulo/division operations for prob mapping for forward updates. No bit-stream or output change. Change-Id: Ifd9ce8ac1437835c305c94f64c18273c7a68f546
-
Dmitry Kovalev authored
-
Ronald S. Bultje authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
Change-Id: I8a2983fb14274a6ac53681fa4cd5d4209cbd2905
-
Yunqing Wang authored
-
Yunqing Wang authored
Added a speed feature in speed 1 to disable splitmv for HD (>=720) clips. Test result on stdhd set: 0.3% PSNR loss and 0.07% SSIM loss. Encoding speedup is 36%. (For reference: the test result on derf set showed 2% PSNR loss and 1.6% SSIM loss. Encoding speedup is 34%. SPLITMV should be enabled for small-resolution videos.) Change-Id: I54f72b94f506c6d404b47c42e71acaa5374d6ee6
-
Jingning Han authored
Compute the rate-distortion cost per transformed block, and accumulate the cost over all blocks inside a partition. This allows the encoder to detect when the cumulative rd cost is already above the best rd cost, thereby enabling early termination in the rate-distortion optimization search. Change-Id: I0a856367a9a7b6dd0b466e7b767f54d5018d09ac
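The early-termination idea can be sketched as below. The function name, the flat array of per-block costs, and the use of INT64_MAX as the "abandoned" marker are all assumptions made for illustration; in the encoder the per-block costs would be computed inside the loop rather than supplied up front.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums per-block rd costs for one candidate partition.
 * Stops as soon as the running total exceeds `best_rd` and returns
 * INT64_MAX to signal that the candidate was abandoned. */
static int64_t accumulate_rd_cost(const int64_t *block_rd, size_t num_blocks,
                                  int64_t best_rd) {
  int64_t total = 0;
  for (size_t i = 0; i < num_blocks; ++i) {
    total += block_rd[i];
    if (total > best_rd) return INT64_MAX;  /* cannot beat the best: stop */
  }
  return total;
}
```

The saving comes from the blocks that are never evaluated after the running total crosses the threshold, so candidates that are clearly worse than the current best cost the least.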
-
Ronald S. Bultje authored
-
Paul Wilkins authored
This reverts commit 13772781. Also fixes a spelling mistake. Change-Id: I5be8aa4d8d3c0323d4a6f41968a7b2c048949c3f
-
Yaowu Xu authored
Change-Id: Icc4f70f0b0f91c9e7d5d00eedd67841afe2f2679
-
Jim Bankoski authored
This CL changes the use-partition-from-last-frame feature to do the following: if the partition is none, horz, or vert, try split; if the partition is not none and one of the children is not split, try none. Change-Id: I5b6c659e35f3ac9f11c051b92ba98af6d7e8aa87 Signed-off-by: Jim Bankoski <jimbankoski@google.com>
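One way to read the two rules above is as a function that decides which extra partition types to try on top of the one inherited from the last frame. All names here are hypothetical, and treating "one of the children is not split" as a single boolean input is a simplifying assumption.

```c
#include <stdbool.h>

/* Hypothetical partition types; the names are illustrative only. */
typedef enum {
  PARTITION_NONE,
  PARTITION_HORZ,
  PARTITION_VERT,
  PARTITION_SPLIT
} PartitionType;

/* Which additional partition types to evaluate for this block. */
typedef struct {
  bool try_split;
  bool try_none;
} ExtraTrials;

static ExtraTrials extra_partition_trials(PartitionType last_frame_part,
                                          bool any_child_not_split) {
  ExtraTrials trials = { false, false };
  /* Rule 1: none, horz, or vert in the last frame -> also try split. */
  if (last_frame_part != PARTITION_SPLIT) trials.try_split = true;
  /* Rule 2: not none and some child is not split -> also try none. */
  if (last_frame_part != PARTITION_NONE && any_child_not_split)
    trials.try_none = true;
  return trials;
}
```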
-
Dmitry Kovalev authored
Change-Id: Ia547a5dd7650b771fd00edd673ab9f920270731c
-
- 01 Jul, 2013 - 15 commits
-
-
Ronald S. Bultje authored
This should significantly speed up cost_coeffs(). Basically, the patch pads the neighbour arrays by one item to prevent an eob check in get_coef_context(), then populates each col/row scan and left/top edge coefficient with two times the same neighbour, which prevents a single/double context branch in get_coef_context(). Lastly, it populates neighbour arrays in pixel order (rather than scan order), so we don't have to dereference the scantable to get the correct neighbours. Total encoding time of first 50 frames of bus (speed 0) at 1500kbps goes from 2min10.1 to 2min5.3, i.e. a 2.6% overall speed increase. Change-Id: I42bcd2210fd7bec03767ef0e2945a665b851df56
-
Dmitry Kovalev authored
Change-Id: I5b413bc0884af0bda38c05332d86490103905b3b
-
Yaowu Xu authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Ronald S. Bultje authored
Encode time of bus (speed 0) 50 frames @ 1500kbps goes from 2min14.4 to 2min10.1, i.e. a 2.3% overall speed increase. Change-Id: I3699580e74ec26c7d24e03681bc47ba25ee1ee87
-
Ronald S. Bultje authored
Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is x86-64 only; it needs some minor modifications to be 32-bit compatible, because it uses 15 xmm registers, whereas 32-bit only has 8. Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
-
Dmitry Kovalev authored
Moving vp9_default_inter_mode_probs array to vp9_entropymode.c. Change-Id: I88ebda86ccc07f2a43c6c01d4b37898214cfb6de
-
Paul Wilkins authored
-
Yaowu Xu authored
Change-Id: I921c9faba6386535aaf717a54301dd346a9b8540
-
Paul Wilkins authored
Added a speed feature that focuses only on thresholds for new motion modes. Moved sf->comp_inter_joint_search_thresh into speed 1. This has ~+0.4% impact on quality at speed 0 as our quality reference baseline. Slight adjustment to baseline thresholds. Change-Id: I7ebf104f1fe29af77ed4837b2e84be065621bbe5
-
Dmitry Kovalev authored
Change-Id: I30ea91561ffac7e5065ba41b2d3ab7dedb720593
-
- 29 Jun, 2013 - 7 commits
-
-
Jingning Han authored
-
Christian Duvivier authored
43,000 -> 5,750 cycles, about 7.5x faster. Change-Id: Ibfd92821b9603f4ed9c256e0ececec14fa4565d0
-
Dmitry Kovalev authored
Change-Id: I83ca53bf6def871f199a382a671f26ad7cbecbca
-
Ronald S. Bultje authored
-
Johann authored
-
James Zern authored
-
Ronald S. Bultje authored
-