- 29 Aug, 2017 2 commits
-
-
Scott LaVarnway authored
-
Scott LaVarnway authored
C vs SSE2 speed gains: _4x4: ~8.12x, _8x8: ~9.71x, _16x16: ~8.21x, _32x32: ~5.0x. BUG=webm:1422 Change-Id: I5e8a1ed4db7b8dc539b3e2a728b0b34d8b4b1993
-
- 28 Aug, 2017 1 commit
-
-
Jerome Jiang authored
Rev d1477715 fixed the test failure. So remove the resolution condition for using source_sad in speed 6. BUG=webm:1452 Change-Id: I1efba97e1ef5bd4de5f886299f6fcb907187abcd
-
- 25 Aug, 2017 7 commits
-
-
Marco Paniconi authored
-
Marco Paniconi authored
-
Marco authored
Enable adapt_partition for vbr mode for speed 6. This allows the usage of the pickmode-based partition (used in speed 5), but only selectively for superblocks with high source sad, otherwise the faster variance based partition scheme is used. For speed 6 on ytlive set: avgPSNR/SSIM metrics up by ~0.6%, several clips up by ~1.5%. Small/negligible decrease in speed. Change-Id: I12f3efef6b3e059391de330fdbe5a44c2587f1f8
-
Marco Paniconi authored
-
Marco authored
For SVC at speed >= 7: only use the improved mv search on the base spatial layer if the top layer resolution is above 640x360. ~2.3% speedup. Small/negligible loss in avgPSNR metrics on rtc set. Change-Id: Iaef75a57ebf1c248931bc1aa28d20b7fecac1851
-
Marco Paniconi authored
This reverts commit f60d1dcd. Reason for revert: Failures in AVX/VP9QuantizeTest in nightly tests. Original change's description: > quantize avx: copy 32x32 implementation > > Ensure avx and ssse3 stay in sync by testing them against each other. > > Change-Id: I699f3b48785c83260825402d7826231f475f697c TBR=slavarnway@google.com,johannkoenig@google.com,builds@webmproject.org Change-Id: Ibd38636212269328317dd0721be9d25452113d1c No-Presubmit: true No-Tree-Checks: true No-Try: true
-
Shiyou Yin authored
Merge "vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi."
-
- 24 Aug, 2017 11 commits
-
-
Marco Paniconi authored
-
Tom Finegan authored
This avoids an endless build loop at vpx_version.h creation time when diff is not present. Change-Id: I16ae386dbdaf14f9a2b85e4c5d1aaa6c08f52a45
-
Johann Koenig authored
-
Shiyou Yin authored
vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi. Change-Id: Ia576a721df6312329b599c31cfe1fb1267a9f174
-
Marco authored
For speeds < 7, increase the threshold that controls the split of 16x16->8x8 blocks, for resolutions 720p and higher. Minor change for speed 5 (since it uses the reference partition scheme, which only uses variance partition as a first step). For speed 6: ~0.5% increase in avgPSNR/SSIM metrics on ytlive set. No change in speed. Change-Id: I5126580973201538d8ca26a9256b93c4d11d685b
-
Johann Koenig authored
-
Johann authored
Ensure avx and ssse3 stay in sync by testing them against each other. Change-Id: I699f3b48785c83260825402d7826231f475f697c
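Testing two implementations against each other can be sketched as follows. This is a minimal illustration, not the libvpx test code: `quantize_ref`, `quantize_opt`, `quantize_fn` and `compare_implementations` are hypothetical names, and the "quantizer" here is a plain truncating division chosen only to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 64

/* Hypothetical common signature shared by both implementations. */
typedef void (*quantize_fn)(const int16_t *coeff, int16_t *qcoeff, size_t n,
                            int16_t step);

/* Reference version: truncating division (assumes step > 0). */
void quantize_ref(const int16_t *coeff, int16_t *qcoeff, size_t n,
                  int16_t step) {
  for (size_t i = 0; i < n; ++i) qcoeff[i] = (int16_t)(coeff[i] / step);
}

/* "Optimized" stand-in: works on the magnitude, then restores the sign. */
void quantize_opt(const int16_t *coeff, int16_t *qcoeff, size_t n,
                  int16_t step) {
  for (size_t i = 0; i < n; ++i) {
    const int mag = abs(coeff[i]);
    const int q = mag / step;
    qcoeff[i] = (int16_t)(coeff[i] < 0 ? -q : q);
  }
}

/* Feed both implementations the same random blocks.
 * Returns 0 if every output matched, 1 on the first mismatch. */
int compare_implementations(quantize_fn a, quantize_fn b, unsigned seed,
                            int iterations) {
  int16_t coeff[BLOCK_SIZE], out_a[BLOCK_SIZE], out_b[BLOCK_SIZE];
  srand(seed);
  for (int it = 0; it < iterations; ++it) {
    for (size_t i = 0; i < BLOCK_SIZE; ++i) {
      coeff[i] = (int16_t)((rand() % 2048) - 1024); /* range [-1024, 1023] */
    }
    a(coeff, out_a, BLOCK_SIZE, 8);
    b(coeff, out_b, BLOCK_SIZE, 8);
    for (size_t i = 0; i < BLOCK_SIZE; ++i) {
      if (out_a[i] != out_b[i]) return 1;
    }
  }
  return 0;
}
```

Calling `compare_implementations(quantize_ref, quantize_opt, 1, 100)` is expected to return 0 as long as the two versions agree; a fixed seed keeps any failure reproducible.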
-
Johann authored
Still does not pass tests. Does match the previous assembly, although saving the sign before multiplying is dubious. Change-Id: Ia163f18c755aba542d6e93f7bf7343184660df5a
-
Johann authored
Change-Id: I1d93698bc27529b0544d79dd7b9fe37afa51ef87
-
Johann Koenig authored
-
Shiyou Yin authored
-
- 23 Aug, 2017 10 commits
-
-
Marco Paniconi authored
-
Johann authored
Change-Id: I77be617c7d7c64929dd51c6077322f4f8ad23897
-
Johann Koenig authored
-
Marco authored
For SVC encoding: average speedup ~1.5%, with small ~0.57 loss in avgPSNR metrics. Change-Id: Icebce6f6ef4e819d7dfcf8db898c583167351de4
-
Scott LaVarnway authored
-
Johann Koenig authored
-
Johann authored
Adds an early exit based on ptest. Slightly slower than ssse3 in the full case because of the extra check, but potentially faster if lots of rows can be skipped. Very close in speed to the assembly. Can run in 32 bit, unlike the assembly. Allows reworking the function prototype to use structs. Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e
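The early-exit idea can be shown with a scalar sketch. In the commit the check is done with the `ptest` instruction on a vector register; the version below only imitates that with a loop over a group of coefficients, and `quantize_with_early_exit`, `GROUP`, `zbin` and `step` are hypothetical names for illustration, not the libvpx prototype.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define GROUP 8 /* coefficients examined per early-exit check (assumption) */

/* Simplified quantizer: coefficients with |c| < zbin become 0, the rest are
 * divided by step (assumes step > 0). Returns the number of groups that were
 * skipped because no coefficient in them reached zbin. */
size_t quantize_with_early_exit(const int16_t *coeff, int16_t *qcoeff,
                                size_t n, int16_t zbin, int16_t step) {
  size_t skipped = 0;
  for (size_t i = 0; i < n; i += GROUP) {
    const size_t end = (i + GROUP < n) ? i + GROUP : n;

    /* Cheap test first: does anything in this group need real work?
     * This plays the role of the vector all-zero test. */
    int any_above = 0;
    for (size_t j = i; j < end; ++j) any_above |= (abs(coeff[j]) >= zbin);

    if (!any_above) {
      /* Early exit for this group: the output is all zeros. */
      for (size_t j = i; j < end; ++j) qcoeff[j] = 0;
      ++skipped;
      continue;
    }

    /* Full path: quantize each coefficient and restore its sign. */
    for (size_t j = i; j < end; ++j) {
      const int mag = abs(coeff[j]);
      const int q = (mag >= zbin) ? mag / step : 0;
      qcoeff[j] = (int16_t)(coeff[j] < 0 ? -q : q);
    }
  }
  return skipped;
}
```

The trade-off matches the commit message: the extra check costs a little when every group needs the full path, and pays off when many groups can be skipped.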
-
Johann authored
Add 1 if negative to get dqcoeff to round towards zero. 10-15% faster than converting to positive before shifting. Change-Id: I01a62fd0c9bca786b6885b318bd447bb9229903d
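A small sketch of the rounding trick, assuming the shift in question is a right shift by one bit (a divide by 2), which is what "add 1" suggests. The function names are hypothetical. Note that right-shifting a negative signed value is implementation-defined in C; the sketch assumes an arithmetic shift, as mainstream compilers provide.

```c
#include <assert.h>
#include <stdint.h>

/* Halve x, rounding toward zero, with a shift instead of a division.
 * An arithmetic shift alone rounds toward negative infinity, so negative
 * inputs are nudged up by 1 first. (x < 0) evaluates to 1 or 0. */
int32_t halve_toward_zero(int32_t x) {
  return (x + (x < 0)) >> 1;
}

/* The alternative mentioned in the commit: convert to positive, shift,
 * then restore the sign. */
int32_t halve_via_abs(int32_t x) {
  const int32_t mag = x < 0 ? -x : x;
  const int32_t half = mag >> 1;
  return x < 0 ? -half : half;
}

/* Self-check: both variants agree with C's truncating division. */
void check_halving(void) {
  for (int32_t x = -70000; x <= 70000; ++x) {
    assert(halve_toward_zero(x) == x / 2);
    assert(halve_via_abs(x) == x / 2);
  }
}
```

For example, `halve_toward_zero(-3)` gives -1, whereas a bare `-3 >> 1` would give -2 under an arithmetic shift.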
-
Johann authored
About 4x faster when values are below the dequant threshold and 10x faster if everything needs to be calculated. Both numbers would improve if the division for dqcoeff could be simplified. BUG=webm:1426 Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2
-
Shiyou Yin authored
Change-Id: I2c782d18d9004414ba61b77238e0caf3e022d8f2
-
- 22 Aug, 2017 9 commits
-
-
Marco Paniconi authored
-
Johann Koenig authored
* changes: quantize ssse3: copy style from sse2; quantize sse2: copy opts from ssse3
-
Marco authored
This feature is used for the CBR RTC encoding mode at speed >= 6. This change will exclude it for VBR mode. For speed 6 live encoding (VBR): avgPSNR/SSIM metrics on ytlive set up by ~1% (a few clips up by 2-3%). No change in speed. Change-Id: I1a0dd94c334f7df309ab5a48d477d7e25355b798
-
Johann authored
Change-Id: I53f8a160e640c674ea035fc112e207b6dca42598
-
Johann Koenig authored
-
Johann authored
Simplify eob calculations based on ssse3 implementation. General clean up and re-scoping. Change-Id: I48f282bf9bd28ee9bc2c7a6779be9d45b5a3a3ee
-
Johann Koenig authored
* changes: quantize: ignore skip_block in arm; quantize: ignore skip_block in x86; quantize fp: ignore skip_block in arm; quantize fp: ignore skip_block in x86
-
Johann authored
This should probably be handled before vp9_regular_quantize_b_4x4 even gets called. Fixes an assert resulting from removing skip_block from the quantize functions. BUG=webm:1459 Change-Id: I7f52b53f959b4654b3d4517ebda31a678f4d0fde
-
James Zern authored
-