Commit 9d67495f authored by Jingning Han's avatar Jingning Han
Browse files

Optimize 32x32 2D inverse DCT for speed-up

This commit exploits the sparsity of quantized coefficient matrix.
It detects each 32x8 array and skip the corresponding inverse
transformation if all entries are zero.

For ped1080p at 8000 kbps, this on average reduces the runtime of
32x32 inverse 2D-DCT SSE2 function from 6256 cycles -> 5200
cycles. It makes the overall encoding process about 2% faster at
speed 0. The speed-up is more pronounceable for the decoding process.

Change-Id: If20056c3566bd117642a76f8884c83e8bc8efbcf
Showing with 152 additions and 68 deletions
Supports Markdown
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment