Commits · a4c94a94ccefb744d7db9a898872e3c9341ae12d · BC / public / external / libvpx

10 Jan, 2014 - 4 commits
- Merge "Optimze inv 16x16 DCT with 10 non-zero coeffs - P2" · a4c94a94
  Jingning Han authored 11 years ago
  
  a4c94a94
- Merge "Optimze inv 16x16 DCT with 10 non-zero coeffs - P1" · faa2ba86
  Jingning Han authored 11 years ago
  
  faa2ba86
- Merge "Cleanups on refresh flags" · 36c8daed
  Deb Mukherjee authored 11 years ago
  
  36c8daed
- Cleanups on refresh flags · 412e4954
  Deb Mukherjee authored 11 years ago
```
Cleanups on frame refresh flags and external overrides.

Change-Id: Ia6a56fe1bde906b1dc3fcbf4ef1c7b207cd2df2d
```
  412e4954
09 Jan, 2014 - 18 commits
- Merge "Use the correct member for initialization" · e8192cf6
  Johann authored 11 years ago
  
  e8192cf6
- Merge "Simplify set_rt_speed_feature()" · b1d81e19
  Yaowu Xu authored 11 years ago
  
  b1d81e19
- Merge "Renaming 'Sharpness' to 'sharpness'." · c8e8d3a4
  Dmitry Kovalev authored 11 years ago
  
  c8e8d3a4
- Simplify set_rt_speed_feature() · 2d381d76
  Yaowu Xu authored 11 years ago
```
1. Made the speed choices progressive.
2. Adjusted the rt speed settings for a better speed/quality trade-off.

Overall, rt-5 gains 2.5% in compression/quality, and the encoding time of the
720p niklas clip drops from 137,052 ms to 121,874 ms.

Change-Id: Ia6e7e1e15225395a868a2f1059c3db8e266e1600
```
  2d381d76
- Optimze inv 16x16 DCT with 10 non-zero coeffs - P2 · af31b27a
  Jingning Han authored 11 years ago
```
This commit further optimizes the SSE2 operations in the second 1-D
inverse 16x16 DCT with fewer than 10 non-zero coefficients. The average
runtime of this module goes down from 779 cycles to 725 cycles.

Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
```
  af31b27a
- Merge "SSSE3 convolution optimization" · f3b9b97c
  Yunqing Wang authored 11 years ago
  
  f3b9b97c
- SSSE3 convolution optimization · 511d218c
  levytamar82 authored 11 years ago
```
Optimizing all SSSE3 assembly for convolution:
1. vp9_filter_block1d4_h8_sse2
2. vp9_filter_block1d8_h8_sse2
3. vp9_filter_block1d16_h8_sse2
4. vp9_filter_block1d4_v8_sse2
5. vp9_filter_block1d8_v8_sse2
6. vp9_filter_block1d16_v8_sse2
The optimizations include:
- Processing 2x8 elements in one 128-bit register instead of processing
8 elements in one 128-bit register.
- Removing unnecessary loads.
This optimization gives a user-level gain of 2.4% for 480p input
and 1.6% for 720p.
The optimization is done only for 64-bit.

Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969
```
  511d218c
- Merge "Removing examples code generation and making them static." · 6d812d6f
  Dmitry Kovalev authored 11 years ago
  
  6d812d6f
- Merge "Using VP9_COMMON instead of VP9_COMP." · 42647fc9
  Dmitry Kovalev authored 11 years ago
  
  42647fc9
- Merge "VP8 for ARMv8 by using NEON intrinsics 01" · c8a2aaa7
  Johann authored 11 years ago
  
  c8a2aaa7
- VP8 for ARMv8 by using NEON intrinsics 01 · 79395e16
  James Yu authored 11 years ago
```
Add bilinearpredict_neon_intrinsics.c
- vp8_bilinear_predict4x4_neon
- vp8_bilinear_predict8x4_neon
- vp8_bilinear_predict8x8_neon
- vp8_bilinear_predict16x16_neon

Change-Id: I33dfa502881219841b442dda32b73220e51b716b
Signed-off-by: James Yu <james.yu@linaro.org>
```
  79395e16
- Merge "Fix rate allocation bug." · 11569060
  Paul Wilkins authored 11 years ago
  
  11569060
- Use the correct member for initialization · 719dadf3
  Johann authored 11 years ago
```
On Windows this fails with:
error C2440: 'initializing': cannot convert from int_mv to uint32_t

Change-Id: I51630efd0e83a0ce620c91aa7859dd6fc1572e99
```
  719dadf3
- Using VP9_COMMON instead of VP9_COMP. · b16fac42
  Dmitry Kovalev authored 11 years ago
```
Change-Id: If7d3958653104f3e170853e931f8489de3ecf3cc
```
  b16fac42
- Merge "Removing direct references to {lst_fb, gld_fb, alt_fb}_idx fields." · d606bf93
  Dmitry Kovalev authored 11 years ago
  
  d606bf93
- Merge "Install test sources for MSVS" · 67ad03ac
  Johann authored 11 years ago
  
  67ad03ac
- Merge "Cleanups around cpi->common." · feaad4f1
  Dmitry Kovalev authored 11 years ago
  
  feaad4f1
- Merge "Renaming 'Mode' to 'mode'." · 4fbe54d2
  Dmitry Kovalev authored 11 years ago
  
  4fbe54d2
08 Jan, 2014 - 18 commits

Install test sources for MSVS · 0239f114

Johann authored 11 years ago

Move the code outside the conditions. The test sources themselves are
also required for Visual Studio.

Change-Id: Id5e93ebc7369e1807eba0b9dc4f7d0f18033d794

0239f114

Optimze inv 16x16 DCT with 10 non-zero coeffs - P1 · ba6ab46c

Jingning Han authored 11 years ago

This commit is the first patch optimizing the SSE2 implementation of the inverse
16x16 DCT with <10 non-zero coefficients. It focuses on the first 1-D (row)
transformation. It exploits the fact that only the top-left 4x4 block contains
non-zero coefficients in a 2-D inverse 16x16 DCT with <10 coefficients.

The average runtime of the idct16x16_10 unit is reduced from
883 cycles to 779 cycles (12% faster).

For pedestrian_area_1080p, 300 frames at 4000 kbps, the speed 2 runtime goes
down from 310651 ms to 305910 ms. The decoding speed goes up from
80.37 fps to 80.87 fps.

Change-Id: Ic6f3ac5a637a76c07ba73ddaafe318a699fea645

ba6ab46c
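
The zero-skipping idea in the commit above can be illustrated with a small scalar sketch. This is not the SSE2 code from the patch: the function names, the flat array layout, and the toy integer basis (standing in for the real DCT basis) are all assumptions made for illustration. It only shows why a separable inverse transform can skip most of the work when the non-zero coefficients are confined to the top-left 4x4 corner.

```c
#include <assert.h>
#include <stdint.h>

#define BLK 16    /* transform size */
#define NZ_DIM 4  /* assumed size of the non-zero top-left corner */

/* Toy integer basis used in place of the DCT basis (assumption). */
int toy_basis(int k, int n) { return ((k * 7 + n * 3) % 11) - 5; }

/* Returns 1 if every coefficient outside the top-left corner is zero. */
int only_top_left_nonzero(const int32_t *in) {
  int r, c;
  for (r = 0; r < BLK; ++r)
    for (c = 0; c < BLK; ++c)
      if ((r >= NZ_DIM || c >= NZ_DIM) && in[r * BLK + c] != 0) return 0;
  return 1;
}

/* Full separable inverse transform: row pass, then column pass. */
void inverse_full(const int32_t *in, int32_t *out) {
  int32_t tmp[BLK * BLK];
  int r, c, k;
  for (r = 0; r < BLK; ++r)
    for (c = 0; c < BLK; ++c) {
      int32_t acc = 0;
      for (k = 0; k < BLK; ++k) acc += in[r * BLK + k] * toy_basis(k, c);
      tmp[r * BLK + c] = acc;
    }
  for (r = 0; r < BLK; ++r)
    for (c = 0; c < BLK; ++c) {
      int32_t acc = 0;
      for (k = 0; k < BLK; ++k) acc += tmp[k * BLK + c] * toy_basis(k, r);
      out[r * BLK + c] = acc;
    }
}

/* Same result, but only touches the rows and terms that can be non-zero. */
void inverse_sparse(const int32_t *in, int32_t *out) {
  int32_t tmp[NZ_DIM * BLK];
  int r, c, k;
  assert(only_top_left_nonzero(in));
  /* Row pass: rows NZ_DIM..BLK-1 are all zero, so they are skipped, and
   * each remaining row has only NZ_DIM non-zero inputs. */
  for (r = 0; r < NZ_DIM; ++r)
    for (c = 0; c < BLK; ++c) {
      int32_t acc = 0;
      for (k = 0; k < NZ_DIM; ++k) acc += in[r * BLK + k] * toy_basis(k, c);
      tmp[r * BLK + c] = acc;
    }
  /* Column pass: only the first NZ_DIM intermediate rows contribute. */
  for (r = 0; r < BLK; ++r)
    for (c = 0; c < BLK; ++c) {
      int32_t acc = 0;
      for (k = 0; k < NZ_DIM; ++k) acc += tmp[k * BLK + c] * toy_basis(k, r);
      out[r * BLK + c] = acc;
    }
}
```

In this sketch the row pass shrinks from 16 rows of 16-term sums to 4 rows of 4-term sums, which is the part the P1 patch targets; the column pass shortening corresponds loosely to the follow-up P2 patch.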

Removing direct references to {lst_fb, gld_fb, alt_fb}_idx fields. · 510a8282
Dmitry Kovalev authored 11 years ago
```
Change-Id: Ib1d9628d2b538b6dc27b0db1fa7f40f70ff2072f
```
510a8282
Cleanups around cpi->common. · 0ecd583d
Dmitry Kovalev authored 11 years ago
```
Change-Id: I0c42a729038d0f4cb7bc07f587d066fcb1dfe9d9
```
0ecd583d
Merge "Add a C fallback for get_msb() and change inline to INLINE." · 8fcb74e6
Alex Converse authored 11 years ago

8fcb74e6
Merge "Add initial intra frame neon optimization. 1~2% gain." · 5be0ed30
hkuang authored 11 years ago

5be0ed30
Renaming 'Mode' to 'mode'. · 962c8b24
Dmitry Kovalev authored 11 years ago
```
Change-Id: I6cdd670d66288dbd66228f38bba6b30502d25362
```
962c8b24
Renaming 'Sharpness' to 'sharpness'. · 57be8136
Dmitry Kovalev authored 11 years ago
```
Change-Id: I54513dc3b3321e0c0bb6b15ea5c34085ed80b4a4
```
57be8136
Merge "Using struct twopass_rc* instead of VP9_COMP*." · feab7e11
Dmitry Kovalev authored 11 years ago

feab7e11

Add a C fallback for get_msb() and change inline to INLINE. · ce7ff3b6

Alex Converse authored 11 years ago

For systems without __builtin_clz() or _BitScanReverse(). The fallback is taken from libwebp.

Change-Id: Iead257efc1772c466c79e1dc0356ed571d38d43e

ce7ff3b6
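
For readers unfamiliar with this kind of fallback, the following is a minimal portable sketch of a most-significant-bit lookup that needs neither __builtin_clz() nor _BitScanReverse(). The name get_msb_fallback and the exact loop are assumptions for illustration and are not copied from the commit or from libwebp.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the zero-based index of the most significant set bit of n.
 * n must be non-zero. Hypothetical helper, not the libvpx source. */
int get_msb_fallback(uint32_t n) {
  int log = 0;
  int shift;
  assert(n != 0);
  /* Binary search: test the upper 16, 8, 4, 2 and 1 bits in turn. */
  for (shift = 16; shift > 0; shift >>= 1) {
    const uint32_t upper = n >> shift;
    if (upper != 0) {
      n = upper;     /* the top bit lies in the upper half */
      log += shift;  /* so account for the bits shifted out */
    }
  }
  return log;
}
```

The loop runs a fixed five iterations for a 32-bit input, so its cost does not depend on the value being scanned.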

Add initial intra frame neon optimization. 1~2% gain. · 691111aa
hkuang authored 11 years ago
```
More intra optimizations will be added.

Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8
```
691111aa
Merge "AVX2 Variance Optimization" · a84029ad
Yunqing Wang authored 11 years ago

a84029ad
Merge "Include gen_msvs_vcxproj.sh" · af720818
Johann authored 11 years ago

af720818
Merge "Replace RD modeling with a fixed point approximation." · 22d83a0a
Alex Converse authored 11 years ago

22d83a0a

AVX2 Variance Optimization · 357b6536

levytamar82 authored 11 years ago

Optimizes the variance functions vp9_variance16x16, vp9_variance32x32,
vp9_variance64x64, vp9_variance32x16, vp9_variance64x32 and
vp9_mse16x16 by migrating them to AVX2.
Some of the functions are optimized by processing 32 elements at a time instead of 16.
Some of the functions are optimized by processing 2 loop strides of 16
elements in a single 256-bit register.
This optimization gives a 2.4% - 2.7% user-level performance gain
and a 42% function-level gain.

Change-Id: I265ae08a2b0196057a224a86450153ef3aebd85d

357b6536
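
As background for the functions listed in this commit, here is a scalar reference sketch of a block variance computation. It assumes the common definition used by video encoders (sum of squared differences minus the squared sum of differences divided by the pixel count); the function name and signature are hypothetical and this is not the AVX2 code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference: returns SSE - sum^2 / (w * h) for a
 * w x h block and stores the raw sum of squared differences in *sse. */
unsigned int block_variance_ref(const uint8_t *src, int src_stride,
                                const uint8_t *ref, int ref_stride,
                                int w, int h, unsigned int *sse) {
  int64_t sum = 0;   /* sum of pixel differences */
  uint64_t sq = 0;   /* sum of squared pixel differences */
  int r, c;
  assert(w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int diff = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += diff;
      sq += (uint64_t)(diff * diff);
    }
  }
  *sse = (unsigned int)sq;
  return (unsigned int)(sq - (uint64_t)((sum * sum) / (w * h)));
}
```

A SIMD version accumulates the same two sums over many pixels per instruction, which is where processing 32 elements in one 256-bit register helps.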

Replace RD modeling with a fixed point approximation. · f2ca665f
Alex Converse authored 11 years ago
```
Change-Id: I44eb44eb3f36c05d916ef140ef42cc84f72f99ec
```
f2ca665f
Merge "Fix an issue in motion vector prediction stage" · aa9552b0
Jingning Han authored 11 years ago

aa9552b0
Include gen_msvs_vcxproj.sh · 87784e3a
Johann authored 11 years ago
```
Change-Id: I28e9cf9347acd7279df3b841863a248479633265
```
87784e3a