ppc: reduce overreads when loading 8 pixels in altivec dsp functions
Altivec can only load naturally aligned vectors. To handle possibly unaligned data, a second vector is loaded from an offset of the original location and the wanted bytes are recovered with a vector permutation. Overreads are minimal when the offset of the second load points to the last element of the data, which is 7 when loading eight 8-bit pixels. With this offset the worst-case overread drops from 16 bytes to 8 bytes if the pixels are 64-bit aligned. For unaligned pixels the worst-case overread drops from 23 bytes to 15 bytes.
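The figures above can be checked with a small model of the two-load scheme. This is only a sketch and not code from the patch: `overread` and `worst_overread` are hypothetical helpers, each load is modelled as ignoring the low four address bits, and the overread is counted as bytes fetched past the last wanted byte. The old offset is assumed to be 15; the message does not state it, but that value reproduces the quoted "before" numbers.

```c
#include <stdint.h>

#define VEC_SIZE 16 /* AltiVec vector width in bytes */

/* Bytes fetched past the last wanted byte when n bytes starting at addr
 * are read with two aligned loads, one at addr and one at addr + offset.
 * Hypothetical helper: each load is modelled as rounding its address
 * down to a multiple of VEC_SIZE. */
static unsigned overread(uintptr_t addr, unsigned n, unsigned offset)
{
    uintptr_t second_block = (addr + offset) & ~(uintptr_t)(VEC_SIZE - 1);
    uintptr_t read_end = second_block + VEC_SIZE; /* one past the last byte read */
    return (unsigned)(read_end - (addr + n));
}

/* Worst overread over all start addresses that are multiples of `step`
 * within one vector: step 8 models 64-bit aligned pixels, step 1 models
 * arbitrary alignment. */
static unsigned worst_overread(unsigned n, unsigned offset, unsigned step)
{
    unsigned worst = 0;
    for (unsigned a = 0; a < VEC_SIZE; a += step) {
        unsigned o = overread(a, n, offset);
        if (o > worst)
            worst = o;
    }
    return worst;
}
```

Under these assumptions, `worst_overread(8, 15, 8)` and `worst_overread(8, 7, 8)` give 16 and 8 for 64-bit aligned pixels, and `worst_overread(8, 15, 1)` and `worst_overread(8, 7, 1)` give 23 and 15 for arbitrary alignment, matching the numbers in the message.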
Showing with 10 additions and 10 deletions