Commit af31b27a authored by Jingning Han

Optimize inv 16x16 DCT with 10 non-zero coeffs - P2

This commit further optimizes the SSE2 operations in the second 1-D
pass of the inverse 16x16 DCT for blocks with fewer than 10 non-zero
coefficients. The average runtime of this module drops from 779
cycles to 725 cycles.

Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
Showing with 111 additions and 15 deletions