- 19 Jul, 2013 - 3 commits
-
-
Dmitry Kovalev authored
Functions: vp9_get_pred_context_switchable_interp, vp9_get_pred_context_intra_inter, vp9_get_pred_context_single_ref_p1, vp9_get_pred_context_single_ref_p2. Change-Id: I3d6fb8aee23c9062270768e1e6da416dd9bb8f96
-
Paul Wilkins authored
Change-Id: I4032dd0442043543954dcb3724df974b7cc7e515
-
Ronald S. Bultje authored
We would skip the rectangular blocks for sub8x8 partitions because we concluded that PARTITION_NONE was better than PARTITION_SPLIT; however, that conclusion was drawn before PARTITION_SPLIT had actually been tested. Change-Id: I8fa91e59894badc1d8cee3ba8a49e40ae4c4a489
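The ordering problem in this message can be sketched as follows. This is a minimal illustration, not the libvpx code: `COST_UNTESTED`, `PartitionCosts`, and `should_test_rect` are hypothetical names, and the rule "skip the rectangular partitions when PARTITION_NONE beats PARTITION_SPLIT" is taken only from the wording of the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: cost of a partition type that has not been tested. */
#define COST_UNTESTED INT64_MAX

typedef struct {
  int64_t none_cost;  /* RD cost of PARTITION_NONE, or COST_UNTESTED */
  int64_t split_cost; /* RD cost of PARTITION_SPLIT, or COST_UNTESTED */
} PartitionCosts;

/* Returns 1 if the rectangular partitions should still be searched.
 * The comparison is only trusted once both costs have been measured;
 * comparing against an untested (maximal) split cost would always make
 * PARTITION_NONE look like the winner. */
static int should_test_rect(const PartitionCosts *costs) {
  assert(costs != NULL);
  if (costs->none_cost == COST_UNTESTED || costs->split_cost == COST_UNTESTED)
    return 1; /* no valid conclusion yet, so do not skip anything */
  /* Skip only when PARTITION_NONE is strictly better than PARTITION_SPLIT. */
  return costs->none_cost >= costs->split_cost;
}
```

With this guard, a call made before PARTITION_SPLIT has been evaluated never suppresses the rectangular search.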
-
- 18 Jul, 2013 - 9 commits
-
-
Dmitry Kovalev authored
Change-Id: Ide58a74d31ff948319445a6337d2c05e98720e34
-
Ronald S. Bultje authored
Change-Id: I96b8058f6dfecf8aa3e152cdcbfd7e10071fbbc9
-
Ronald S. Bultje authored
This prevents a duplicate memcpy of a 128-byte struct every time set_scale_factors() is called (which is a lot), thus leading to a decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block RD/partition loop. Overall, this decreases encoding time of the first 50 frames of bus @ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5% overall speedup. We can likely get more gains by removing the copy of the other struct (and replacing it with an indexing) as well. Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a
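The difference between copying a struct on every call and indexing into a table can be sketched as below. This is an illustration under stated assumptions, not the actual change: `ScaleFactors`, `EncoderState`, `select_by_copy`, and `select_by_pointer` are hypothetical names, and the struct layout is invented so that it is large enough for the copy to matter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_REFS 3 /* hypothetical number of reference frames */

/* Hypothetical per-reference data: 32 ints, i.e. 128 bytes where int is
 * 4 bytes. */
typedef struct {
  int x_scale;
  int y_scale;
  int reserved[30];
} ScaleFactors;

typedef struct {
  ScaleFactors table[NUM_REFS];   /* one entry per reference frame */
  ScaleFactors active_copy;       /* copying approach: private duplicate */
  const ScaleFactors *active_ptr; /* indexing approach: points into table */
} EncoderState;

/* Copies the whole struct every time a reference is selected. */
static void select_by_copy(EncoderState *s, int ref) {
  assert(s != NULL && ref >= 0 && ref < NUM_REFS);
  memcpy(&s->active_copy, &s->table[ref], sizeof(s->active_copy));
}

/* Stores only a pointer to the table entry; nothing is copied. */
static void select_by_pointer(EncoderState *s, int ref) {
  assert(s != NULL && ref >= 0 && ref < NUM_REFS);
  s->active_ptr = &s->table[ref];
}
```

Both selectors expose the same values to the caller; the pointer variant is only safe as long as the table entry is not modified or freed while it is in use.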
-
Ronald S. Bultje authored
This means we only do UV intra mode selection if we find any intra mode to actually be useful at all; in addition, we only do UV intra mode selection for the transform sizes that were selected, rather than all sizes available in this partition. First 50 frames of bus @ 1500kbps (speed 0) gains about 5% with this change. Change-Id: I7b461eb8b803247f57896c5a9505f745b55502b3
-
Ronald S. Bultje authored
The break statement only breaks out of the nested loop, not the top-level loop, so it doesn't always work as intended. Changing it to a return statement does what's intended. Change-Id: I585419823b39a04ec8826b1c8a216099b1728ba7
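A small, self-contained illustration of the pitfall: in C, `break` leaves only the innermost enclosing loop, while `return` leaves the whole function. The functions and data below are hypothetical and unrelated to the encoder code that was fixed.

```c
#include <assert.h>
#include <stddef.h>

/* Counts cells visited before the first negative value, scanning row by row.
 * Intended behavior: stop scanning entirely at the first negative cell. */

/* Buggy variant: break exits only the inner (column) loop, so scanning
 * resumes at the start of the next row. */
static int visited_with_break(const int *cells, int rows, int cols) {
  int visited = 0;
  assert(cells != NULL && rows > 0 && cols > 0);
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      if (cells[r * cols + c] < 0) break;
      ++visited;
    }
  }
  return visited;
}

/* Fixed variant: return exits both loops at once. */
static int visited_with_return(const int *cells, int rows, int cols) {
  int visited = 0;
  assert(cells != NULL && rows > 0 && cols > 0);
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      if (cells[r * cols + c] < 0) return visited;
      ++visited;
    }
  }
  return visited;
}
```

For a 2x3 grid `{1, -1, 1, 1, 1, 1}`, the `break` variant still walks the whole second row, whereas the `return` variant stops after the first cell.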
-
Ronald S. Bultje authored
The same information already exists in union b_mode_info. Change-Id: Iac5086b99a3c3cc270380138062bb693e58f9e6d
-
Ronald S. Bultje authored
This could happen during golden overlay frame coding from a previous alt-ref frame if the special overlay code was triggered. Change-Id: I3056d0c547cd26903b260ef93c94026e96bd9868
-
Ronald S. Bultje authored
Change-Id: Id4f454831f3f11099f39c30246adeaa52857d08d
-
Jingning Han authored
Make the use of mv_check_bounds consistent for mvs of both ref_frame[0] and ref_frame[1]. Change-Id: I1ca24865cc7232ca9cbe5db566c53abad1592211
-
- 17 Jul, 2013 - 14 commits
-
-
Dmitry Kovalev authored
These arrays have constant values (they are never updated). Removing the two corresponding memcpy calls. Also doing a little cleanup in vp9_entropymode.h: removing the redundant 'extern' keyword and moving all function declarations to the end. Change-Id: Ia16b38b46aec2e2500f5df29c40a297ae241dede
-
Yunqing Wang authored
vp9_init_quantizer() is called in vp9_create_compressor(), and should not be called in vp9_set_speed_features(). Change-Id: Ic2f1f4b0531b9d46bb841d7e1d8da9812207dad6
-
Ronald S. Bultje authored
Encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min6.2 to 1min5.9, i.e. 0.5% faster overall. Change-Id: I59d8a3b2f0a75010fa041d5e2646c8caac5bd683
-
Ronald S. Bultje authored
Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min7.3 to 1min6.2, i.e. 1.7% faster overall. Change-Id: I19d2deacfbffadd61d32551cee9586757ab4a987
-
Yaowu Xu authored
Change-Id: Ic4c4b363ed840935e42f495f13ea5e601a56f1b2
-
Ronald S. Bultje authored
Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min12.8 to 1min7.3, i.e. 8% faster. Change-Id: Ia22d1c7b687316c553cc60eacae988b24e175b62
-
Yunqing Wang authored
Added disable_splitmv feature at other speed levels. For speed 3 or above, always turn it on. Change-Id: Ibb36f0a7ef12a34b4f8d0f9cb6193eab43b34360
-
Ronald S. Bultje authored
About 15% faster for bus (speed 0) first 50 frames @ 1500kbps, which goes from 1min36 to 1min24. Results become slightly better (+0.2% on derf/yt, +0.4% on hd), probably because of a bugfix for skipmode in super_block_yrd(). Overall speed change (on derfraw300) is roughly -13%. This can probably be improved further by caching best_yrd between partition searches. Also, we might be able to get more speedups by always doing PARTITION_NONE before PARTITION_SPLIT, not just at the sb8x8 level. Change-Id: I83736949ebd5b4a3b400ee688d7661913fefc98b
-
Ronald S. Bultje authored
+0.2% SSIM and glbPSNR on derfraw300. Change-Id: I9cba0bca55e606a22f557c7732b064f738efe84d
-
Yunqing Wang authored
Current partition checking starts from small sizes and then goes up to large sizes. This experiment uses the small partitions' motion estimation results, which are already available, to speed up the large partition's motion estimation. We can skip some partition checks if they are unlikely choices. We could also use the motion vector (MV) result as the current partition's prediction MV and limit the search range and reference frame. Current results at speed 1: PSNR loss: 1.19% for stdhd, 0.287% for derf; speed gain: 14% for sunflower (hd), 11% for akiyo. Further improvements will be made later. Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab
-
Paul Wilkins authored
Use an estimate based on DC_PRED for intra uv cost within the rd loop then only do a full uv mode analysis if an intra mode is chosen. Significant speed gains in some cases. Currently only enabled for speed 2 pending speed/quality tests. Change-Id: Ie851a12400d5483bce47ec0e3ccb8516041e91c0
-
Paul Wilkins authored
Apply limit if search_method == USE_LARGESTALL to the range of UV tx sizes searched. Change-Id: I6db29f0dd237285ffc50d75a37e8b68151ad821c
-
Jingning Han authored
This commit makes the encoder perform motion search only once per reference frame type for each 4x4/4x8/8x4 block. For bus_cif at 2000 kbps, the runtime at speed 0 goes from 253812ms to 217817ms (14% speed-up). Change-Id: I5f17599ccc8cfaf93ccb4f98fcb6008af6d79e92
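Running a search once per reference frame and reusing the result is essentially memoization. The sketch below shows that idea only; it is not the encoder's implementation. `MvCache`, `mv_cache_init`, `get_mv`, and `run_motion_search` are hypothetical names, and the "search" is a stand-in that returns a made-up vector.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REF_FRAMES 3 /* hypothetical number of reference frame types */

typedef struct {
  int row;
  int col;
} MotionVector;

typedef struct {
  MotionVector mv[NUM_REF_FRAMES]; /* cached result per reference frame */
  int valid[NUM_REF_FRAMES];       /* 1 once mv[ref] has been computed */
  int searches_run;                /* number of real searches performed */
} MvCache;

/* Marks every cache slot as empty. */
static void mv_cache_init(MvCache *cache) {
  assert(cache != NULL);
  for (int ref = 0; ref < NUM_REF_FRAMES; ++ref) {
    cache->mv[ref].row = 0;
    cache->mv[ref].col = 0;
    cache->valid[ref] = 0;
  }
  cache->searches_run = 0;
}

/* Stand-in for an expensive motion search; the result is an arbitrary
 * function of the reference index, chosen only for illustration. */
static MotionVector run_motion_search(int ref) {
  MotionVector mv = { 2 * ref, -ref };
  return mv;
}

/* Returns the motion vector for ref, searching only on the first request. */
static MotionVector get_mv(MvCache *cache, int ref) {
  assert(cache != NULL && ref >= 0 && ref < NUM_REF_FRAMES);
  if (!cache->valid[ref]) {
    cache->mv[ref] = run_motion_search(ref);
    cache->valid[ref] = 1;
    ++cache->searches_run;
  }
  return cache->mv[ref];
}
```

Repeated requests for the same reference frame then cost a lookup instead of a new search, provided the cache is reset whenever the block being searched changes.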
-
Dmitry Kovalev authored
Change-Id: Ieffea49eb7a5e5092f21f8694c546aff69b07c6d
-
- 16 Jul, 2013 - 9 commits
-
-
Dmitry Kovalev authored
Removing VP9_COMMON* argument and adding struct tx_probs* instead of MACROBLOCKD*. Change-Id: Idf61074631a90ec51eac22c8dcd977f44ac0757c
-
Dmitry Kovalev authored
Change-Id: I4884cdc2557d25d50c7c4f7e19b1ad8bdb93cd63
-
Dmitry Kovalev authored
Removing tile_rows and tile_columns from VP9Common, removing redundant constants MIN_TILE_WIDTH and MAX_TILE_WIDTH, changing signature of vp9_get_tile_n_bits. Change-Id: I8ff3104a38179b2c6900df965c144c1d6f602267
-
James Zern authored
s/frame_rate/framerate/g Change-Id: I6fc3e088e419c5f46e3a9390dd8a2cad2677a2fc
-
Dmitry Kovalev authored
Making the implementation of vp9_set_pred_flag_{seg_id, mbskip} consistent with vp9_get_segment_id, without using the confusing sub(a, b) macro. Passing mi_row and mi_col to functions explicitly instead of relying on mb_to_right_edge and mb_to_bottom_edge. Change-Id: I54c1087dd2ba9036f8ba7eb165b073e807d00435
-
Paul Wilkins authored
Change-Id: I94b97a966b5efbc9a243048f1f5ddbbdc4b1846e
-
Yaowu Xu authored
This is a short term optimization till we work out a decoder implementation requiring no frame border extension. Change-Id: I02d15bfde4d926b50a4e58b393d8c4062d1be70f
-
Dmitry Kovalev authored
Removing unused and duplicated constants, moving them from *.h to *.c if possible. Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f
-
Ronald S. Bultje authored
Cycle times:
4x4: 151 to 131 cycles (15% faster)
8x8: 334 to 306 cycles (9% faster)
16x16: 1401 to 1368 cycles (2.5% faster)
32x32: 7403 to 7367 cycles (0.5% faster)
Total encode time of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min39.2 to 1min38.6, i.e. a 0.67% overall speedup. Change-Id: I799a49460e5e3fcab01725564dd49c629bfe935f
-
- 15 Jul, 2013 - 3 commits
-
-
Ronald S. Bultje authored
Also inline some of the block calculations to help the compiler avoid silly things like calculating the same offset (or converting between raster/transform block offset or block, mi and pixel unit) many, many times.
Cycle times:
4x4: 584 -> 505 cycles (16% faster)
8x8: 1651 -> 1560 cycles (6% faster)
16x16: 7897 -> 7704 cycles (2.5% faster)
32x32: 16096 -> 15852 cycles (1.5% faster)
Overall, this saves about 0.5 seconds (1min49.8 -> 1min49.3) on the first 50 frames of bus (speed 0) @ 1500kbps, i.e. 0.5% overall. Change-Id: If3dd62453f8e2ab9d4ee616bc4ea956fb8874b80
-
Jingning Han authored
Skip the inverse transform and reconstruction of inter-mode coded blocks in the rate-distortion optimization loop when the skip_encode_sb feature is turned on. This provides about a 1% speed-up at speed 0 and a 1.5% speed-up at speed 1. No performance change in either setting. Change-Id: I2932718bf4d007163702b61b16b6ff100cf9d007
-
Jingning Han authored
This speed feature allows the encoder to largely remove the spatial dependency between blocks inside a 64x64 superblock, thereby removing the need to repeatedly encode superblocks per partition type in the rate-distortion optimization loop. A major challenge lies in the intra modes tested in the rate-distortion optimization loop. The subsequent blocks do not have access to the reconstructed boundary pixels without the intermediate coding steps. This was resolved by using the original pixels for intra prediction in the rd loop, followed by an appropriately designed distortion modeling on the quantization parameters. Experiments also suggested that the performance impact is more discernible at lower bit-rate/psnr settings. Hence a quantizer dependent threshold is applied to deactivate skip of block coding. For bus_cif at 2000 kbps, speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB performance loss. speed 1: runtime 65312ms -> 61536ms, (7...
-
- 14 Jul, 2013 - 1 commit
-
-
James Zern authored
frames_since_golden / frames_till_alt_ref_frame are unused. Change-Id: I348e7689d4d75412cf4de7703d885be942e4a26b
-
- 13 Jul, 2013 - 1 commit
-
-
Dmitry Kovalev authored
Change-Id: Id9b6ceeddca3f9b34bfada5c499b1e7a2f42c30b
-