- 05 Aug, 2014 - 1 commit
-
-
Pengchong Jin authored
This patch allows the encoder to split the block directly during partition search, thereby skipping the search for NONE. It computes a score that measures whether the 16x16 motion vectors from the first pass in the current block are consistent with each other. If they are inconsistent and we have enough Q to encode, the block is split directly and the search for NONE is skipped. This feature is under the flag CONFIG_FP_MB_STATS. At speed 2, it gives a further speedup of 3-8% on sample yt clips compared to the previous version under the same flag. Overall, the features under the flag give a 7-15% speedup on typical yt clips at data rates up to 6000kbps. The speedup at very high data rates is not significant. For hard stdhd clips: park_joy_1080p @ 15000kbps: 504541ms -> 506293ms (-0.35%); pedestrian_area_1080p @ 2000kbps: 326610ms -> 290090ms (+11.2%). The compression performance using the features under the flag: derf: -0.068%, yt: -0.189%, hd: -0.318%, stdhd: -0.183%. To use the feature, set CONFIG_FP_MB_STATS and turn on cpi->use_fp_mb_stats. Change-Id: Iad58a2966515c8861aa9eb211565b1864048d47f
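The decision described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the score (mean absolute deviation of the vectors from their average), the `has_enough_q` flag, the threshold, and every `ex_`/`Ex` name are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical motion vector of one 16x16 first-pass block. */
typedef struct {
  int row;
  int col;
} ExMv16;

/* Illustrative inconsistency score: mean absolute deviation of the vectors
 * from their average, summed over both components. 0 means all identical. */
static int ex_mv_inconsistency_score(const ExMv16 *mvs, int count) {
  long sum_row = 0, sum_col = 0, dev = 0;
  assert(count > 0);
  for (int i = 0; i < count; ++i) {
    sum_row += mvs[i].row;
    sum_col += mvs[i].col;
  }
  /* |mv - mean| is computed as |mv * count - sum| / count to stay in
   * integers; the second division by count averages over the blocks. */
  for (int i = 0; i < count; ++i) {
    dev += labs((long)mvs[i].row * count - sum_row);
    dev += labs((long)mvs[i].col * count - sum_col);
  }
  return (int)(dev / ((long)count * count));
}

/* Returns 1 when the block should be split directly, skipping NONE.
 * has_enough_q stands in for the commit's "enough Q to encode" condition,
 * whose exact form is not given in the message. */
static int ex_should_split_directly(const ExMv16 *mvs, int count,
                                    int has_enough_q, int score_thresh) {
  return has_enough_q && ex_mv_inconsistency_score(mvs, count) > score_thresh;
}
```

With four vectors split evenly between (0, 0) and (8, 0), the score is 4, so a threshold of 2 would trigger a direct split whenever the Q condition holds.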
-
- 04 Aug, 2014 - 2 commits
-
-
Pengchong Jin authored
Re-organize the one-byte structure for 16x16 first pass block. Add bits to indicate motion vector directions. Change-Id: Id10754ba343dfc712c7fed5bcc85c67fa0bbcb89
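A one-byte record with direction bits could look something like the sketch below. The bit positions and the `EX_` names are hypothetical; the commit message does not give the actual layout.

```c
#include <stdint.h>

/* Hypothetical bit assignments inside the one-byte per-block record. */
enum {
  EX_MOTION_ZERO = 0x01,
  EX_MOTION_LEFT = 0x02,
  EX_MOTION_RIGHT = 0x04,
  EX_MOTION_UP = 0x08,
  EX_MOTION_DOWN = 0x10
};

/* Encodes the direction of a motion vector into flag bits. A zero vector
 * sets only EX_MOTION_ZERO; otherwise one horizontal and/or one vertical
 * bit is set according to the sign of each component. */
static uint8_t ex_pack_mv_direction(int mv_row, int mv_col) {
  uint8_t bits = 0;
  if (mv_row == 0 && mv_col == 0) return (uint8_t)EX_MOTION_ZERO;
  if (mv_col < 0) bits = (uint8_t)(bits | EX_MOTION_LEFT);
  if (mv_col > 0) bits = (uint8_t)(bits | EX_MOTION_RIGHT);
  if (mv_row < 0) bits = (uint8_t)(bits | EX_MOTION_UP);
  if (mv_row > 0) bits = (uint8_t)(bits | EX_MOTION_DOWN);
  return bits;
}
```

Storing only the sign of each component keeps the record at one byte per 16x16 block while still letting the second pass compare directions between neighboring blocks.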
-
Jim Bankoski authored
Change-Id: I1b3c5337f018dde27dc819ab18bd081d169a91e8
-
- 31 Jul, 2014 - 1 commit
-
-
Jingning Han authored
The mb_energy value is used by aq-mode. Turn off computing its value when aq-mode is off. Change-Id: I26c239f124eca45a5ee58b90d19eae00d9a7cda5
-
- 30 Jul, 2014 - 3 commits
-
-
Pengchong Jin authored
This patch allows the encoder to skip the search for partition SPLIT, HORZ, VERT after the search for partition NONE is done in RD optimization. It uses the first pass block-wise statistics to make the decision. If all 16x16 blocks in the current partition have zero motion and small residuals according to the first pass statistics, and the partition has a small difference variance, further partition search is skipped. For the speed 2 setting, experiments on general youtube clips show that the speedup varies from 1% to 10%, 5% on average. On the performance side in PSNR: derf 0.004%, yt -0.059%, hd -0.106%, stdhd 0.032%. For hard stdhd clips: park_joy_1080p, 502952 ms -> 503307 ms (-0.07%); pedestrian_area_1080p, 227049 ms -> 220531 ms (+3%). This feature is under the compilation flag CONFIG_FP_MB_STATS and it is off in the current setting. Change-Id: I554537e9242178263b65ebe14a04f9c221b58bae
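The early-termination test described here might be sketched as below. The struct, the function name, and the variance threshold are assumptions for illustration; only the three conditions themselves come from the commit message.

```c
/* Hypothetical per-16x16 first-pass statistics. */
typedef struct {
  int zero_motion; /* non-zero if the first pass found a zero vector */
  int small_error; /* non-zero if the first-pass residual was small */
} ExFpBlockStats;

/* Returns 1 when SPLIT, HORZ and VERT can be skipped after NONE: every
 * 16x16 block is static with a small residual, and the source difference
 * variance of the whole partition is below var_thresh. */
static int ex_skip_further_partitions(const ExFpBlockStats *blocks, int count,
                                      unsigned int src_diff_var,
                                      unsigned int var_thresh) {
  for (int i = 0; i < count; ++i) {
    if (!blocks[i].zero_motion || !blocks[i].small_error) return 0;
  }
  return src_diff_var < var_thresh;
}
```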
-
Jingning Han authored
Remove the variable that indicates the relative block index. This is explicitly covered by the use of pc_tree. Change-Id: Ib13142582fff926c85e375bde656aa050add8350
-
Jingning Han authored
This commit enables a chessboard pattern constrained partition search for 720p and above resolutions. The scheme applies a stricter partition search to alternate blocks based on the partition range of their above/left neighboring blocks, as well as that of the collocated blocks in the previous frame. It is currently turned on at the 16x16 block size level. The chessboard pattern is flipped on every coded frame. The speed 3 runtime is reduced: park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up); pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up). The compression performance is changed: hd -0.223%, stdhd -0.295%. Change-Id: I2d4d123ae89f7171562f618febb4d81789575b19
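A chessboard pattern that flips on every frame can be expressed with a parity check. The sketch below assumes block coordinates in units of 16x16 blocks; the function names are hypothetical and are not taken from the patch.

```c
/* Parity of the frame index: flips the pattern on every coded frame. */
static int ex_chessboard_index(int frame_index) { return frame_index & 0x1; }

/* Returns 1 for blocks that get the stricter (constrained) partition
 * search. blk_row and blk_col are in units of 16x16 blocks, so adjacent
 * blocks alternate, and the whole pattern inverts between frames. */
static int ex_is_constrained_block(int frame_index, int blk_row, int blk_col) {
  return (blk_row + blk_col + ex_chessboard_index(frame_index)) & 0x1;
}
```

Because the pattern inverts each frame, a block that was constrained in one frame is searched fully in the next, so no position is constrained indefinitely.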
-
- 29 Jul, 2014 - 2 commits
-
-
Jingning Han authored
This commit replaces the repeated retrieval of the max and min allowed partition sizes from speed_feature with local variables max_size and min_size. Change-Id: Ib06f11f16615e4876e4dd5fb6a968c6bf5f7b216
-
Jingning Han authored
The get_chessboard_index() function used to take the entire VP9_COMMON struct pointer to retrieve the chessboard pattern index. This CL makes it take the frame index directly. Change-Id: I3cad9d209ea2e77a358085a04fe1ff0ddec5ba03
-
- 25 Jul, 2014 - 1 commit
-
-
Jingning Han authored
The partition search for 4x4 blocks takes unnecessary steps to reconstruct pixels and an extra partition type update. This commit removes such operations. No visible compression/speed difference. Thanks to Yue (yuec@) for finding this issue. Change-Id: I3f83824aa3fd3717d63be0b280fa57258939a70a
-
- 24 Jul, 2014 - 1 commit
-
-
Tim Kopp authored
This should prevent confusion with the VP8 CONFIG_TEMPORAL_DENOISING and other flags. Change-Id: I1fe4e2977895b7966841d861ab74317ad875b6c8
-
- 22 Jul, 2014 - 1 commit
-
-
Adrian Grange authored
Fixed the function get_frame_type to return the correct frame type for golden and last frames. Change-Id: I8edddd9aa26cbe7a1de8ff211389410b22b1bd14
-
- 21 Jul, 2014 - 2 commits
-
-
Alex Converse authored
Change-Id: Ic5d3a3a0dac10b49495771886a31e793bb78b5ca
-
Yunqing Wang authored
For gcc, when libvpx config option debug is disabled, added the flag -DNDEBUG to disable the assertions in libvpx for some speedup. Change-Id: Ifcb7b9e8ef5cbe5d07a24407b53b9a2923f596ee
-
- 17 Jul, 2014 - 1 commit
-
-
Pengchong Jin authored
The bug sets the wrong pointer to the first pass mb stats if the encoder does the re-coding in the second pass. Change-Id: I8a11f45dd7dceb38de814adec24cecccae370d00
-
- 15 Jul, 2014 - 1 commit
-
-
Tim Kopp authored
In vp8, statistics are collected about the different modes as they are searched. In vp9 this process is more complicated due to the variable block size. Fields were added to the PICK_MODE_CONTEXT struct to hold this information for each point in the search. The information is then taken from the appropriate part of the tree during denoising. Change-Id: I89261ab77ad637821287ae157dfdf694702b8e77
-
- 11 Jul, 2014 - 1 commit
-
-
Yunqing Wang authored
Changed to use the defined inline functions consistently throughout the code. Change-Id: I7644d24fa7a837378564a6e0790416d3725dd200
-
- 07 Jul, 2014 - 1 commit
-
-
Jingning Han authored
Change-Id: Id6eedc502c86433df1456dd994aee6bc9a1359a2
-
- 02 Jul, 2014 - 2 commits
-
-
Alex Converse authored
vp9_rdopt is for making rd optimal mode decisions. vp9_rd is for all other rd related routines. Anything used outside of making an rd optimal decision belongs in rd. Change-Id: I772a3073f7588bdf139f551fb9810b6864d8e64b
-
Jingning Han authored
This commit re-designs the quantization process for transform coefficient blocks of size 4x4 to 16x16. It improves compression performance for speed 7 by 3.85%. The SSSE3 version for the new quantization process is included. The average runtime of the 8x8 block quantization is reduced from 285 cycles -> 255 cycles, i.e., over 10% faster. Change-Id: I61278aa02efc70599b962d3314671db5b0446a50
-
- 01 Jul, 2014 - 1 commit
-
-
Yunqing Wang authored
Fixed the signed/unsigned mismatch. Change-Id: Id83d603b8f1745b71f4cf695a0751e55518b1316
-
- 30 Jun, 2014 - 3 commits
-
-
Yaowu Xu authored
Encoder still uses SWITCHABLE as default via DEFAULT_INTERP_FILTER, but does not override the default if it is not SWITCHABLE. Change-Id: I3c0f6653bd228381a623a026c66599b0a87d01d5
-
Jingning Han authored
When the frame is intra coded only, the encoder takes the RD coding flow. Hence the function set_mode_info is not practically in use. This commit removes it and the associated conditional branches. Change-Id: I1e42659ceb55b771ba712d1cdecacb446aa6460d
-
Yunqing Wang authored
Before encoding a frame, calculate and store each 16x16 block's variance of the source difference between the last and current frames. Find the partitioning threshold T for the frame from its variance histogram, and then use T to make partition decisions. Compared with fixed 16x16 partitioning, the rtc set test showed an overall psnr gain of 3.242% and an ssim gain of 3.751%. The best psnr gain is 8.653%. The overall encoding speed didn't change much. It got faster for some clips (for example, a 12% speedup for vidyo1), and a little slower for others. Also, a minor modification was made in the datarate unit test. Change-Id: Ie290743aa3814e83607b93831b667a2a49d0932c
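One way to derive a threshold from a variance histogram is sketched below. The commit message does not say how T is chosen, so the percentile rule, the bin count, the bin width, and all `ex_`/`EX_` names here are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NUM_BINS 8   /* hypothetical number of histogram bins */
#define EX_BIN_WIDTH 16 /* hypothetical variance units per bin */

/* Builds a histogram of per-block variances and returns the upper edge of
 * the first bin at which the cumulative block count reaches `percent` of
 * all blocks. Variances beyond the last bin are clamped into it. */
static unsigned int ex_find_threshold(const unsigned int *vars, size_t count,
                                      int percent) {
  size_t hist[EX_NUM_BINS] = {0};
  size_t cum = 0;
  int bin;
  assert(count > 0 && percent > 0 && percent <= 100);
  for (size_t i = 0; i < count; ++i) {
    size_t b = vars[i] / EX_BIN_WIDTH;
    if (b >= EX_NUM_BINS) b = EX_NUM_BINS - 1;
    ++hist[b];
  }
  /* The loop always breaks by the last bin, where cum equals count. */
  for (bin = 0; bin < EX_NUM_BINS; ++bin) {
    cum += hist[bin];
    if (cum * 100 >= count * (size_t)percent) break;
  }
  return (unsigned int)(bin + 1) * EX_BIN_WIDTH;
}

/* Blocks whose variance is below T are candidates for a larger partition. */
static int ex_use_large_partition(unsigned int var, unsigned int t) {
  return var < t;
}
```

Deriving T per frame lets the cut-off adapt to content, which a fixed 16x16 partitioning cannot do.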
-
- 29 Jun, 2014 - 1 commit
-
-
Jim Bankoski authored
Change-Id: I7f989d197444d166133ad91eb23ac1033109f58d
-
- 26 Jun, 2014 - 2 commits
-
-
Jingning Han authored
This commit enables an adaptive transform size selection method for speed -6. It uses the largest transform size when the sse is more than 4 times the variance, i.e., most energy is compacted in the DC coefficient. Otherwise, it uses the default TX_8X8. It improves the compression efficiency for the rtc set at speed -6 by 0.8%, with no speed change observed. Change-Id: Ie6ed1e728ff7bf88ebe940a60811361cdd19969c
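The selection rule stated above can be sketched as follows. The enum and function names are hypothetical stand-ins, and the caller is assumed to supply the largest transform size allowed for the block.

```c
#include <stdint.h>

/* Hypothetical stand-in for the encoder's transform size enum. */
typedef enum { EX_TX_4X4, EX_TX_8X8, EX_TX_16X16, EX_TX_32X32 } ExTxSize;

/* Picks the largest allowed transform when sse > 4 * var, i.e. when most
 * of the residual energy sits in the DC term; otherwise falls back to 8x8.
 * The comparison is done in 64 bits so 4 * var cannot overflow. */
static ExTxSize ex_select_tx_size(unsigned int sse, unsigned int var,
                                  ExTxSize largest_allowed) {
  return ((uint64_t)sse > 4 * (uint64_t)var) ? largest_allowed : EX_TX_8X8;
}
```

Since the sse equals the variance plus the squared-mean (DC) contribution, an sse far above the variance means the residual is nearly flat, which a single large transform represents compactly.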
-
Pengchong Jin authored
This patch allows the encoder to skip the partition search for the frame if it is an inter frame and only zero motion vectors have been detected in the first pass. The partition size is directly assigned according to the difference variance. Borg tests show little overall performance change in terms of PSNR (derf -0.027%, yt 0.152%, hd 0.078%, stdhd 0%). The worst case of PSNR loss is -0.514% from yt. The best PSNR gain is 4.293% from yt. The second pass encoding speedup for slideshow clips is 15%-40%. Change-Id: I881f347d286553ee5594a9ea09ba1a61ac684045
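A rough sketch of the frame-level shortcut follows. The two-threshold mapping from difference variance to block size is an assumption; the commit only states that the size is assigned according to the difference variance.

```c
/* Returns 0 when the normal partition search must run. Otherwise returns
 * the directly assigned block dimension (64, 32 or 16): flatter blocks,
 * with a lower difference variance, get larger partitions. Both
 * thresholds are hypothetical tuning parameters. */
static int ex_direct_block_size(int is_inter_frame, int only_zero_motion,
                                unsigned int diff_var,
                                unsigned int low_thresh,
                                unsigned int high_thresh) {
  if (!is_inter_frame || !only_zero_motion) return 0;
  if (diff_var < low_thresh) return 64;
  if (diff_var < high_thresh) return 32;
  return 16;
}
```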
-
- 24 Jun, 2014 - 2 commits
-
-
Yunqing Wang authored
In real-time speed 6, no partition search is done. The inter prediction results obtained during mode picking can be reused in the subsequent encoding process. A speed feature reuse_inter_pred_sby is added to enable the reuse only in speed 6. This patch doesn't change the encoding result. RTC set tests showed that the encoding speed gain is 2% - 5%. Change-Id: I3884780f64ef95dd8be10562926542528713b92c
-
Paul Wilkins authored
Fix some bugs relating to the use of buffers in the overlay frames. Fix a bug where a mid-sequence overlay was propagating large partition and transform sizes into the subsequent frame because sf->last_partitioning_redo_frequency > 1 and sf->tx_size_search_method == USE_LARGESTALL. Change-Id: Ibf9ef39a5a5150f8cbdd2c9275abb0316c67873a
-
- 20 Jun, 2014 - 3 commits
-
-
Alex Converse authored
Change-Id: Ibb841a1fa4d08d164cf5461246ec290f582b1f80
-
Alex Converse authored
Change-Id: I549868725b789f0f4f89828005a65972c20df888
-
Alex Converse authored
Copy split from macroblock to pick mode context so it doesn't get lost. Change-Id: Ie37aa12558dbe65c4f8076cf808250fffb7f27a8
-
- 12 Jun, 2014 - 3 commits
-
-
Dmitry Kovalev authored
Change-Id: Ifa6374e9db5919322733b656e0865f5f19ee6f2c
-
Jingning Han authored
This commit enables a fast-path computational flow for the forward transform. It checks the sse and variance of the prediction residuals and decides whether the quantized coefficients are all zero, dc only, or more. It then selects the corresponding coding path in the forward transform and quantization stage. It is currently enabled in rtc coding mode; rd coding mode will follow. In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps goes down from 14234 ms to 13704 ms, i.e., about a 4% speed-up. Overall coding performance for the rtc set is changed by -0.18%. Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1
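The three-way classification might be sketched as below. The thresholds and names are hypothetical, and treating `sse - var` as the DC energy is an assumption of this example; the commit does not spell out the exact test.

```c
#include <assert.h>

/* Hypothetical labels for the three coding paths. */
typedef enum { EX_PATH_SKIP, EX_PATH_DC_ONLY, EX_PATH_FULL } ExTxPath;

/* Classifies a residual block from its sse and variance. The variance
 * approximates AC energy and sse - var approximates DC energy; each is
 * compared against a quantizer-dependent threshold supplied by the caller. */
static ExTxPath ex_pick_tx_path(unsigned int sse, unsigned int var,
                                unsigned int dc_thresh,
                                unsigned int ac_thresh) {
  assert(sse >= var); /* holds when var = sse - (sum * sum) / n */
  if (var >= ac_thresh) return EX_PATH_FULL;
  return (sse - var < dc_thresh) ? EX_PATH_SKIP : EX_PATH_DC_ONLY;
}
```

The skip and DC-only paths avoid the full forward transform and quantization, which is where the runtime saving would come from.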
-
Alex Converse authored
Add a set_mode_info_seg_skip function that fills the requisite mode info. Change-Id: I460b1b6845d720d9b09ed5b64df0ea0aac443f62
-
- 09 Jun, 2014 - 1 commit
-
-
Yunqing Wang authored
In non-rd real-time mode, choosing a smaller transform size in encoding gives better video quality and a good speed gain compared with choosing a larger transform size. This patch sets the tx size search method to ALLOW_8X8, which is better than using 4x4 or other larger sizes. Borg tests on the rtc set at speed 6 showed a significant gain in quality: PSNR gain 11.034% and SSIM gain 15.466%. The speed gain is 5% - 12% for <720p clips, and 2% - 7% for 720p clips. Change-Id: If4dc74ed2df359346b059f47fb73b4a0193ec548
-
- 06 Jun, 2014 - 2 commits
-
-
Dmitry Kovalev authored
This is not a speed feature, adding inline function instead. Change-Id: Ia48c41802eec9e92cf990339d724097279695c9a
-
Dmitry Kovalev authored
Change-Id: Ib8187c8f2556e1e9268b0683cd2b6ff3489f0205
-
- 05 Jun, 2014 - 2 commits
-
-
Dmitry Kovalev authored
Change-Id: Ifcb46e6904730d14b9ef76b648b4d0dc3cd5d0c5
-
Dmitry Kovalev authored
The same enum defined and used in vp9_mvref_common.c. Change-Id: I3975103997797add0a258d36c96d20ac9561a73d
-