- 29 Jul, 2017 1 commit
-
-
Marco authored
Move the source_sad feature to speed 6 (from speed 7), and add a speed feature to switch from the variance-based partition to reference_partition (which uses nonrd-pickmode for bsize selection) if source_sad is high. Currently used only at speed 6 for resolutions <= 360p. About 4-5% improvement on 360p in the RTC set. Some slowdown, but still ~30% faster than speed 5. Change-Id: Ib0330ee5fe9fdd2608aed91359a2a339d967491c
-
- 22 Jul, 2017 1 commit
-
-
James Zern authored
For 8-bit the subtrahend is small enough to fit into uint32_t. For 10/12-bit apply: 63a37d16 Prevent negative variance. Previously: 47b9a091 Resolve -Wshorten-64-to-32 in highbd variance; c0241664 Resolve -Wshorten-64-to-32 in variance. Change-Id: I181c85f0b9a03da37c2e8b89482d48aa3dbc0aee
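The 8-bit bound mentioned above can be checked with a small sketch. It assumes the usual form variance = sse - sum^2 / N and a largest block of 64x64 pixels; the name `variance_terms` is hypothetical and is not taken from libvpx.

```python
UINT32_MAX = (1 << 32) - 1


def variance_terms(src, ref, n_pixels_log2):
    """Return (sse, subtrahend, variance) for two equal-length pixel lists.

    Assumes variance = sse - (sum * sum) / N, with N = 2**n_pixels_log2.
    Hypothetical helper for illustration only.
    """
    diffs = [s - r for s, r in zip(src, ref)]
    total = sum(diffs)                  # signed sum of pixel differences
    sse = sum(d * d for d in diffs)     # sum of squared differences
    subtrahend = (total * total) >> n_pixels_log2
    return sse, subtrahend, sse - subtrahend


# Worst case for 8-bit input: every difference is 255 in a 64x64 block.
N_LOG2 = 12  # 64 * 64 = 4096 pixels
sse, subtrahend, variance = variance_terms(
    [255] * (1 << N_LOG2), [0] * (1 << N_LOG2), N_LOG2
)
fits_in_uint32 = subtrahend <= UINT32_MAX
```

Note that the intermediate square (sum * sum) still needs 64 bits in this worst case; only the shifted result fits into 32 bits.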
-
- 17 Jul, 2017 2 commits
-
-
Marco authored
When content_state_sb is set to LowVarHighSumdiff, don't reset it to VeryHighSad. Visually better on clips with strong lighting changes. Small/negligible change in RTC metrics and speed. Change-Id: I20c383e3c4cf8d1149de5f9260449c0b7cf7c6aa
-
Marco authored
When int_pro_motion_estimation is done for superblock in choose_partitioning, use it to avoid the full_pixel_search for NEWMV mode, if bsize is >= 32X32. For speed > 7. Small/neutral change on RTC metrics. ~1-2% speedup on arm on high motion clip. Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b
-
- 14 Jul, 2017 1 commit
-
-
Marco authored
Only affects speed 7. Improvement on high motion clips. Change-Id: Ibddb68fed9c63207df29ffd790f9205b1cecf687
-
- 11 Jul, 2017 1 commit
-
-
Jerome Jiang authored
Change-Id: Iebc9dd293d8b1449c0674c0295349297e9b90646
-
- 07 Jul, 2017 1 commit
-
-
James Zern authored
+ vpx_dsp/, test/: itxfm -> inv_txfm, ftxfm -> fwd_txfm. Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e
-
- 06 Jul, 2017 1 commit
-
-
Marco authored
If the content_state for a superblock is set to HighSad, use that to bias some decisions in variance partition and nonrd pickmode: use int_pro_motion for sad computation in choose_partitioning, and set large_block in pickmode based on the content_state_sb. Only affects speed >= 7. Improvement for high motion content. Small gain (~1%) in RTC metrics. Speedup of ~5 for high motion clip on android (speed 8, 1 thread). Change-Id: I5774c4854f012b89c8e969f6129b60988c2ce11c
-
- 29 Jun, 2017 1 commit
-
-
James Zern authored
txfm is more commonly used as an abbreviation throughout the codebase. Change-Id: I86fd90ef132468f9da270091c05daa1f5a49ece2
-
- 27 Jun, 2017 1 commit
-
-
Jerome Jiang authored
This could save some cycles since skin detection is used in multiple places in vp9. 1~2% speed up on ARM. Change-Id: I86b731945f85215bbb0976021cd0f2040ff2687c
-
- 22 Jun, 2017 3 commits
-
-
Marco authored
Use it to limit NEWMV early exit in nonrd pickmode. Small change in RTC metrics, with some improvement for high motion clips. Change-Id: I1d89fd955e1b3486d5fb07f4472eeeecd553f67f
-
James Zern authored
quiets -Wmissing-prototypes Change-Id: I696223d75860edba13c6b6f38c1f8db353a6f812
-
Marco authored
Skin detection usage in choose_partitioning should be guarded by cpi->use_skin_detection. Change-Id: I6986179af9ce94c60c0974d66c311fc07cc04cfe
-
- 18 May, 2017 2 commits
-
-
Marco authored
When temporal layers are used, only allow for copy partition on the top temporal enhancement layer frames. Change-Id: I5472abdc0f9f6c8dafa75a7a84c615e08ae22af8
-
Marco authored
Only affects speed 8. Make changes to copy partition to fix a bug in setting microblock offset. Avg PSNR shows 0.02% gain on rtc_derf and 0.08% loss on rtc. Change-Id: I61c3e5914dde645331344388e7437e5638acd4f3
-
- 11 May, 2017 1 commit
-
-
Marco authored
Increase the partition and acskip thresholds for temporal enhancement layers. ~1-2% speedup, with negligible loss in quality. Change-Id: Id527398a05855298ad9ddac10ada972482415627
-
- 25 Apr, 2017 2 commits
-
-
Jerome Jiang authored
For speed >= 8 and color_sensitivity not set, skip the transform skipping test in UV planes. Add a new condition to check noise level to skip chroma check for speed >= 8 if y_sad is high. 1~2% speedup on ARM for speed 8. Borg tests show neutral results in both rtc and rtc_derf. Change-Id: Idecd3ff6e28c97757a43bb6f3a7082c85f72109c
-
Marco authored
Add a low-variance high-sumdiff state to the superblock content state and use it to limit the mv and bias some decisions in non-rd pickmode. Only affects speed >= 6. Reduces artifacts for lighting changes. Small/no difference in metrics on RTC set. Change-Id: Ic84b2379fe0ae3fa71ae826ee6bae3eaf551a25b
-
- 24 Apr, 2017 1 commit
-
-
Yunqing Wang authored
This patch follows the allow_exhaustive_searches feature modification and continues to modify the encoder to achieve determinism in row-based multi-threaded encoding. When row-mt = 1 and multiple threads are used, the adaptive feature in the encoder is disabled, which gives a BDRate gain (at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%), but some encoder speed loss (7% ~ 10% at speed 1 and 3% ~ 6% at speed 2). These speed losses are acceptable considering the speed gains obtained from row-mt. Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
-
- 21 Apr, 2017 1 commit
-
-
Yunqing Wang authored
A previous patch turned on the allow_exhaustive_searches feature only for FC_GRAPHICS_ANIMATION content. This patch further modified the feature by removing the exhaustive search limit, and made it no longer adaptive. As a result, the 2 counts that recorded the number of motion searches were removed, which helped achieve determinism in row-based multi-threaded encoding. Tests showed that this patch didn't make the encoder much slower. Used exhaustive_searches_thresh for this speed feature, and removed allow_exhaustive_searches. Also, refactored the speed feature code to follow the general speed feature setting style. Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
-
- 20 Apr, 2017 1 commit
-
-
Marco authored
The more aggressive settings should only be used when denoise_svc condition is satisfied (which means top spatial layer). Change-Id: Ia8e3515b27f31bf21b1976ca80a2fa826daece3a
-
- 11 Apr, 2017 1 commit
-
-
Jerome Jiang authored
Set adaptive_rd_thresh to 2 when simple block yrd is not used. Fix regression caused by computing y sad without int_pro_motion_estimation on low res motion clips. Overall 0.07% quality loss on rtc_derf. Change only affects low res on speed 8. Change-Id: Ic6a188a56529f1034d6431005fb4b0e24e8a7e27
-
- 10 Apr, 2017 1 commit
-
-
Marco authored
For speed 5, 1 pass CBR: Don't use the nonrd_pick_partition on the segment, rather use choose_partitioning followed by nonrd_select_partition (as is done on base segment). Little/no quality loss on RTC and RTC_derf (< 0.3%), speedup of at least 5%. Change-Id: I5273d5f950e60adf5e437b4ca8c4f63964641e83
-
- 06 Apr, 2017 2 commits
-
-
James Zern authored
vp9_high_get_sby_perpixel_variance: the variance operated on is already in 32 bits. Change-Id: I97006eb9c08dbd0f88ee35e1a1ca205737508296
-
Jerome Jiang authored
Little change in overall PSNR in rtc. 2-4% speedup on VGA on ARM. Change-Id: I3395806d7afd456deacd4077c330adca13ab0645
-
- 05 Apr, 2017 1 commit
-
-
Marco authored
Temporal denoiser runs in non-rd pickmode, so it is only used for speed >= 5. Regression exists for speed 5, due to use of reference_partition (which uses non-rd pickmode for partitioning). Avoid denoising for now at speed 5. Change-Id: I74a74d2e1404d7cfd33dcf4ec06dd2e503256cf0
-
- 31 Mar, 2017 1 commit
-
-
Yunqing Wang authored
The row mt sync read uses sync_range = 1, and wouldn't work if we want to use a sync_range that is greater than 1. To make it work, this sync read code is modified. Pass in col instead of col - 1 to make it consistent with other row mt code in VP9, and then add 1 in the "while" condition. Change-Id: I4a0e487190ac5d47b8216368da12d80fec779c1a
-
- 27 Mar, 2017 2 commits
-
-
Marco authored
For non-rd variance partition, avoid the chroma check unless y_sad is below some threshold. Small decrease in avgPSNR (~0.3) on RTC set. Small/negligible decrease on RTC_derf. Change-Id: I7af44235af514058ccf9a4f10bb737da9d720866
-
Marco authored
Refactor to split the 1 pass source sad computation into scene detection (currently used for VBR and screen-content mode), and superblock based source sad computation (used in non-rd CBR mode). This allows the source sad computation for CBR mode to be multi-threaded. No change in compression. Change-Id: I112f2918613ccbd37c1771d852606d3af18c1388
-
- 24 Mar, 2017 1 commit
-
-
Marco authored
Make the source_sad feature work properly for cases of VBR or screen_content with SVC. Added unittest for SVC with screen-content on. Change-Id: Iba5254fd8833fb11da521e00cc1317ec81d3f89b
-
- 23 Mar, 2017 1 commit
-
-
Marco authored
Since y_sad is not computed yet (on the early exit due to source_sad), no need to check for setting color_sensitivity. Only affects speed >= 8. No change in behavior. Change-Id: I3a6f2d20fed38d8b8ec51b75bcacf9a21f2db916
-
- 22 Mar, 2017 1 commit
-
-
Jerome Jiang authored
Change it to a row-based array to avoid the slowdown caused by sync. With row-mt on, speed 8, 2 threads: ~4% speedup for VGA on ARM, benefiting from adaptive_rd_threshold. Change-Id: I887e65a53af20a6c4f48d293daaee09dab3512cf
-
- 21 Mar, 2017 1 commit
-
-
Yunqing Wang authored
Computed the partition search early termination score in a separate function. Change-Id: I1894b517ff179a38b1c05e054d373ac4b7f4cbb4
-
- 20 Mar, 2017 3 commits
-
-
Marco authored
Add additional condition to split to 16x16, for resolutions <= 360p, reduces dragging artifact near moving boundary. Small/no change on RTC metrics. Change-Id: I314694f2166435d918f74e7ab42f002b07f40dae
-
Marco authored
For each superblock, keep track of how far from the current frame the last significant content change was, and use that (along with GF distance) to turn off GF search in non-rd pickmode. Only enabled for speed >= 8. avgPSNR on RTC/RTC_derf down by ~0.9/1.2. Speedup on mac: ~3-5%. Speedup on arm: 3.6% for VGA and 4.4% for HD. Change-Id: Ic3f3d6a2af650aca6ba0064d2b1db8d48c035ac7
-
Yunqing Wang authored
The sum of tx block eobs is needed in the machine learning based partition early termination. The eobs are first accumulated during tx search, and then the value associated with the best tx_size is copied to ctx for later use. After the sum of eobs is calculated correctly, re-enabled the ml_partition_search_early_termination speed feature. Re-did the quality/speed test to check the impact of the fix. 1. Borg test BDRATE result: 4k set: PSNR: +0.183%; SSIM: +0.100%; hdres set: PSNR: +0.168%; SSIM: +0.256%; midres set: PSNR: +0.186%; SSIM: +0.326%; 2. Average speed gain result: 4k clips: 21%; hd clips: 26%; midres clips: 15%. The result is in line with the original result. Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
-
- 13 Mar, 2017 1 commit
-
-
Yunqing Wang authored
This patch was based on Yang Xian's intern project code. Further modifications were done. 1. Moved machine-learning related parameters into the context structure. 2. Corrected the calculation of sum_eobs. 3. Removed unused parameters and calculations. 4. Made it work with multiple tiles. 5. Added a speed feature for the machine-learning based partition search early termination. 6. Re-organized the code. The patch was rebased to the top-of-tree. Borg test BDRATE result: 4k set: PSNR: +0.144%; SSIM: +0.043%; hdres set: PSNR: +0.149%; SSIM: +0.269%; midres set: PSNR: +0.127%; SSIM: +0.257%; Average speed gain result: 4k clips: 22%; hd clips: 23%; midres clips: 15%. Change-Id: I0220e93a8277e6a7ea4b2c34b605966e3b1584ac
-
- 08 Mar, 2017 1 commit
-
-
Yunqing Wang authored
The 2 thresholds (i.e. partition_search_breakout_dist_thr and partition_search_breakout_rate_thr) are used as the partition search early termination speed feature. This refactoring patch made this feature frame-size dependent consistently throughout the code. Change-Id: Idaa0bd8400badaa0f8e2091e3f41ed2544e71be9
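A minimal sketch of how a frame-size-dependent breakout of this kind could work is given below. The threshold values, the 720-line cutoff, and the function names are all assumptions made for illustration; the commit message does not state them, and the real encoder condition may include further checks.

```python
def breakout_thresholds(width, height):
    """Return (dist_thr, rate_thr) for a frame size.

    The cutoff and the numbers are made up for illustration only.
    """
    if min(width, height) >= 720:
        return 1 << 22, 300
    return 1 << 21, 250


def stop_partition_search(best_dist, best_rate, width, height):
    """Stop searching smaller partitions once both the distortion and the
    rate of the best candidate fall below their thresholds."""
    dist_thr, rate_thr = breakout_thresholds(width, height)
    return best_dist < dist_thr and best_rate < rate_thr
```

Deriving both thresholds in a single place is one way to keep the feature consistent across the code paths that consult it, which is the kind of consistency the refactoring describes.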
-
- 02 Mar, 2017 1 commit
-
-
Vignesh Venkatasubramanian authored
Enable row level multithreading for realtime encodes where non-rd path is used (speed >= 5). Change-Id: I5439cb49a02171166d8e1de06c7d5e6f8e819a41
-
- 27 Feb, 2017 1 commit
-
-
Marco authored
From commit: https://chromium-review.googlesource.com/c/441393/ On non-segment the set_vbp_thresholds() should be called again to adjust thresholds based on content_state of superblock. This was the intended behavior from 441393. Small change in RTC metrics and speed. Change-Id: I45e5fbdc4af74db76b3cb4f13074fcae0eb2219e
-