- 29 Jul, 2013 - 3 commits
-
-
Dmitry Kovalev authored
Change-Id: I10bf06e3a3d5271221ae6a42a36074d01d493039
-
Dmitry Kovalev authored
Change-Id: I6aa4191935aa93461a07c41b59fdae1eb5f5f107
-
Jingning Han authored
-
- 27 Jul, 2013 - 7 commits
-
-
Dmitry Kovalev authored
Change-Id: I5a3e83102784cabb918a5404405fcab99c5bb9b6
-
Ronald S. Bultje authored
This allows us to increment the position at the band level only when we move from one band to the next. More importantly, it lets us use an add instead of a multiply instruction, and omit the instruction altogether if the band does not change from one coefficient to the next, which is slightly faster (probably more noticeable on systems where a multiply is expensive, such as ARM).
Change-Id: I4343fe35b9f9a47fa00b217bdcbf5f91ff96c381
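The idea in the message above can be sketched as follows. This is only an illustration, not the libvpx code: `BAND_STRIDE`, `BAND_COUNTS`, and both function names are hypothetical, and the band sizes are made-up values.

```python
# Hypothetical layout: each band owns BAND_STRIDE consecutive table entries.
BAND_STRIDE = 6
# Hypothetical number of coefficients per band, in scan order (sums to 16).
BAND_COUNTS = [1, 2, 3, 4, 6]


def offsets_multiply(n):
    """Reference version: one multiply per coefficient."""
    band_of = [b for b, cnt in enumerate(BAND_COUNTS) for _ in range(cnt)]
    return [band_of[c] * BAND_STRIDE for c in range(n)]


def offsets_incremental(n):
    """Advance the offset with an add, and only when the band changes."""
    offsets = []
    band = 0
    left = BAND_COUNTS[0]  # coefficients remaining in the current band
    offset = 0
    for _ in range(n):
        if left == 0:
            # Crossing into the next band: a single add, no multiply.
            band += 1
            left = BAND_COUNTS[band]
            offset += BAND_STRIDE
        offsets.append(offset)
        left -= 1
    return offsets
```

Both functions produce the same offsets; the second does no work at all for coefficients that stay within the current band.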
-
Dmitry Kovalev authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Jingning Han authored
This commit brings back the shortcut implementation of the 8x8/16x16 inverse 2D-DCT. When eob <= 10, it skips the inverse transform operations on rows 4:7 (8x8) or 4:15 (16x16) in the first pass. For bus_cif at 1000 kbps, this provides about a 2% speed-up at speed 0.
Change-Id: I453e2d72956467d75be4ad8c04b4482ab889d572
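A minimal sketch of why the shortcut is safe, for the 8x8 case. It assumes a plain floating-point orthonormal DCT and a scan order in which the first 10 positions all lie in rows 0-3 (true of the classic zigzag order); it does not reproduce libvpx's integer transforms or scan tables, and all function names are hypothetical.

```python
import math

N = 8


def idct_1d(x):
    """Naive orthonormal inverse DCT of a length-n list."""
    n = len(x)
    out = []
    for i in range(n):
        s = x[0] / math.sqrt(n)
        for k in range(1, n):
            s += (math.sqrt(2.0 / n) * x[k]
                  * math.cos(math.pi * (2 * i + 1) * k / (2 * n)))
        out.append(s)
    return out


def idct_2d(block, rows_to_transform=N):
    """Row pass then column pass; untransformed rows are assumed all zero."""
    tmp = [idct_1d(block[r]) if r < rows_to_transform else [0.0] * N
           for r in range(N)]
    out = [[0.0] * N for _ in range(N)]
    for c in range(N):
        col = idct_1d([tmp[r][c] for r in range(N)])
        for r in range(N):
            out[r][c] = col[r]
    return out


def idct_2d_eob(block, eob):
    """Skip the row pass on rows 4..7 when at most 10 coefficients are coded."""
    return idct_2d(block, rows_to_transform=4 if eob <= 10 else N)
```

Because rows 4-7 hold only zeros in this case, their row transforms would also be zero, so skipping them does not change the result.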
-
- 26 Jul, 2013 - 11 commits
-
-
Dmitry Kovalev authored
Renaming:
read_intra_mode_info -> read_intra_frame_mode_info
read_inter_mode_info -> read_inter_frame_mode_info
read_intra_block_part -> read_intra_block_mode_info
read_inter_block_part -> read_inter_block_mode_info
read_ref_frame -> read_ref_frames
read_reference_frame -> read_is_inter_block
Using num_4x4_blocks_{wide, high}_lookup instead of bit shifts.
Change-Id: I83c81573b4ef6f53f2f8d24683895014bebfba61
-
Jingning Han authored
-
Dmitry Kovalev authored
-
hkuang authored
-
Jingning Han authored
This commit enables a special-case path for the 8x8 inverse 2D-DCT when only the DC coefficient is quantized to a non-zero value. For bus_cif at 2000 kbps, it provides about a 1% speed-up at speed 0.
Change-Id: I2523222359eec26b144cf8fd4c63a4ad63b1b011
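To illustrate the DC-only case: with an orthonormal floating-point 8x8 DCT (an assumption here; libvpx uses integer arithmetic with its own scaling and rounding), a block whose only non-zero coefficient is DC reconstructs to a constant. The function names below are hypothetical.

```python
import math

N = 8


def idct_2d_direct(coeffs):
    """Direct O(N^4) orthonormal 2-D inverse DCT, used only as a reference."""
    def a(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (a(u) * a(v) * coeffs[u][v]
                          * math.cos(math.pi * (2 * y + 1) * u / (2 * N))
                          * math.cos(math.pi * (2 * x + 1) * v / (2 * N)))
            out[y][x] = s
    return out


def idct_2d_dc_only(dc):
    """DC-only shortcut: every output sample equals dc / N."""
    value = dc / N
    return [[value] * N for _ in range(N)]
```

The shortcut replaces the full row and column passes with one division and a fill, which is where the speed-up in this case would come from.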
-
hkuang authored
Change-Id: I748dee8938dfb19f417f24eed005f3d216f83a82
-
Dmitry Kovalev authored
-
Ronald S. Bultje authored
Change-Id: Ie48035ff4f93c41f8a9b3023e6444fd10432d8fb
-
Yaowu Xu authored
-
Paul Wilkins authored
Speed feature experiment to set an upper and lower partition size limit based on what has been seen in spatial neighbors.
This seems to give quite reasonable speed gains in local tests (10-15%), and when used with speed 0 the losses are small (0.25% derf, 0.35% stdhd). However, for now I am only enabling it on speed 1, as there may be clashes with the existing temporal partition selection in speed 2.
Using a tighter min / max around the range derived from the neighbors increases speed further, but at the cost of a bigger quality loss.
However, I think this spatial method could be combined with data from either the last frame or a variance method (or both) to refine the range of minimum and maximum partition size, i.e. consider the min and max from spatial and temporal neighbors and the variance recommendation.
Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f
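A rough sketch of deriving partition size limits from spatial neighbors. The size list, the `slack` parameter, and the function name are assumptions made for illustration; the commit does not say how much the neighbor-derived range is widened.

```python
# Square partition sizes in pixels, smallest to largest (illustrative list).
SIZES = [4, 8, 16, 32, 64]


def partition_limits(neighbor_sizes, slack=1):
    """Return (min_size, max_size) to search, from neighbor partition sizes.

    slack is a hypothetical knob: how many size steps to widen the observed
    range on each side. A smaller slack means a tighter, faster search.
    """
    if not neighbor_sizes:
        # No neighbors available (e.g. frame corner): search everything.
        return SIZES[0], SIZES[-1]
    lo = min(SIZES.index(s) for s in neighbor_sizes)
    hi = max(SIZES.index(s) for s in neighbor_sizes)
    lo = max(lo - slack, 0)
    hi = min(hi + slack, len(SIZES) - 1)
    return SIZES[lo], SIZES[hi]
```

With neighbors of size 16 and 32 and the default slack, the search would cover 8 through 64; with `slack=0` it would be restricted to exactly the observed range.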
-
Yunqing Wang authored
Used 3 * standard_deviation in the internal threshold calculation instead of a fitted curve. This approximates the algorithm better.
For comparison, similar tests were run. The overall PSNR loss is lower than before.
1. derf set: when static-thresh = 1, psnr loss is 0.329%; when static-thresh = 500, psnr loss is 0.970%;
2. stdhd set: when static-thresh = 1, psnr loss is 0.922%; when static-thresh = 500, psnr loss is 1.307%;
A similar speedup is achieved. For example,
clip              bitrate  static-thresh  psnr    time
akiyo(cif)        500      0              48.952  5.077s(50f)
akiyo             500      500            48.866  4.169s(50f)
parkjoy(1080p)    4000     0              30.388  78.20s(30f)
parkjoy           4000     500            30.367  70.85s(30f)
sunflower(1080p)  4000     0              44.402  74.55s(30f)
sunflower         4000     500            44.414  68.69s(30f)
Change-Id: Ic78833642ce1911dbbd1cb6c899a2d7e2dfcc1f3
-
- 25 Jul, 2013 - 19 commits
-
-
Dmitry Kovalev authored
Now read_inter_mode_info calls read_intra_block_part (renamed from read_intra_block_modes) or read_inter_block_part (just added).
Change-Id: I541badea6b663e0ae692ec158665efb90ed20c03
-
Johann authored
-
Yunqing Wang authored
-
Yunqing Wang authored
This option exists in VP8, and it was rewritten in VP9 to support skipping at different partition levels. After prediction is done, we can check whether the residuals in the partition block will all be quantized to 0. If so, the skip flag is set, and only prediction data are needed for reconstruction. Based on the DCT's energy conservation property, the skip check can be estimated in the spatial domain. The prediction error is calculated and compared to a threshold. The threshold is determined by the dequant values and is also adjusted by partition size. To be precise, the DC and AC parts of the Y, U, and V planes are checked to decide whether to skip.
Tests showed:
1. derf set: when static-thresh = 1, psnr loss is 0.666%; when static-thresh = 500, psnr loss is 1.162%;
2. stdhd set: when static-thresh = 1, psnr loss is 1.249%; when static-thresh = 500, psnr loss is 1.668%;
For different clips, the encoding speedup ranges from a few percent to 20+% when static-thresh <= 500. For example,
clip              bitrate  static-thresh  psnr    time
akiyo(cif)        500      0              48.923  5.635s(50f)
akiyo             500      500            48.863  4.402s(50f)
parkjoy(1080p)    4000     0              30.380  77.54s(30f)
parkjoy           4000     500            30.384  69.59s(30f)
sunflower(1080p)  4000     0              44.461  85.2s(30f)
sunflower         4000     500            44.418  78.1s(30f)
Higher static-thresh values give a larger speedup with a larger quality loss.
Change-Id: I857031ceb466ff314ab580ac5ec5d18542203c53
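The spatial-domain estimate described above can be sketched as follows for a single plane. This assumes an orthonormal DCT and a round-to-nearest quantizer, and uses a deliberately conservative threshold of half the dequant step; the actual thresholds, their partition-size adjustment, and the role of static-thresh in the encoder are not reproduced. The function name and parameters are hypothetical.

```python
def can_skip(residual, dc_dequant, ac_dequant):
    """Estimate whether all coefficients of a block would quantize to zero.

    residual is a flat list of prediction errors for one plane of the block.
    Assumes an orthonormal transform, so energy is preserved (Parseval).
    """
    n = len(residual)
    total = sum(residual)
    sse = sum(r * r for r in residual)
    # For an orthonormal 2-D DCT, the squared DC coefficient is total^2 / n.
    dc_energy = total * total / n
    # Everything that is not DC energy is AC energy.
    ac_energy = sse - dc_energy
    # Conservative test: if the total AC energy is below (q/2)^2, no single
    # AC coefficient can reach q/2, so each one rounds to zero.
    dc_zero = dc_energy < (dc_dequant / 2.0) ** 2
    ac_zero = ac_energy < (ac_dequant / 2.0) ** 2
    return dc_zero and ac_zero
```

A full check along the lines of the commit message would apply a test like this to the Y, U, and V planes and skip only if all of them pass.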
-
Johann authored
Change-Id: I0625d8ffddf590dfecd1bb8b8d6f57ef64b8bf18
-
Dmitry Kovalev authored
Removing unused constants, macros, and function declarations. Using ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving #include from *.h to *.c. Merging for loops for motion vectors.
Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
Change-Id: Ia6144d77ebed66e0739b62e4d673e26a95aa9550
-
Adrian Grange authored
-
Adrian Grange authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Jingning Han authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Yaowu Xu authored
-
Adrian Grange authored
Simplified the code that extracts and uses the motion vectors for the 4 sub-partitions in rd_pick_partition.
Change-Id: Iaf698ef7ee3aef9edd59015e1ae065dd359b17d9
-
James Zern authored
-
Jingning Han authored
This commit makes the initialization of trellis coefficient optimization a per-plane operation, thereby eliminating the redundant steps in encode_sby and encode_sbuv. It makes the encoder slightly faster at speed 0.
Change-Id: Iffe9faca6a109dafc0dd69dc7273cbdec19b17cd
-