1. 18 Dec, 2014 1 commit
  2. 12 Dec, 2014 3 commits
  3. 11 Dec, 2014 7 commits
    • Remove unnecessary dqcoeff memset. · 3c7a06c3
      hkuang authored
      dqcoeff is set to 0 on initialization and is reset to 0 every time
      after it is used.
      
      Change-Id: I32b8e149bba40a8d707849f737a8e49a691f319c
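The invariant this message relies on can be sketched as follows. This is a minimal illustration, not the libvpx code: the names block_ctx, block_ctx_init, consume_and_clear, is_all_zero and the size COEFF_COUNT are hypothetical. The buffer is zeroed once at initialization, and each consumer clears only the entries it used, so a further memset before the next use would be redundant.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COEFF_COUNT 64 /* hypothetical buffer size for this sketch */

typedef struct {
  int16_t dqcoeff[COEFF_COUNT];
} block_ctx;

/* Zero the buffer once, when the context is created. */
static void block_ctx_init(block_ctx *ctx) {
  memset(ctx->dqcoeff, 0, sizeof(ctx->dqcoeff));
}

/* Consume the first eob coefficients, then restore the all-zero state
 * by clearing only the entries that were used. Returns their sum as a
 * stand-in for real work such as an inverse transform. */
static int consume_and_clear(block_ctx *ctx, int eob) {
  int i, sum = 0;
  assert(eob >= 0 && eob <= COEFF_COUNT);
  for (i = 0; i < eob; ++i) sum += ctx->dqcoeff[i];
  memset(ctx->dqcoeff, 0, (size_t)eob * sizeof(ctx->dqcoeff[0]));
  return sum;
}

/* Check the invariant: every entry is zero between uses. */
static int is_all_zero(const block_ctx *ctx) {
  int i;
  for (i = 0; i < COEFF_COUNT; ++i)
    if (ctx->dqcoeff[i] != 0) return 0;
  return 1;
}
```

As long as every consumer follows this pattern, the buffer is already zero whenever the next block starts writing into it.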
    • Replace division with bit shift in choose_partitioning · d5c396a9
      Jingning Han authored
      This commit explicitly uses the bit shift operation instead of
      division for computing block variance.
      
      Change-Id: Id19c0ff27dd1d1ae4aceee6657e1aad0d406bd74
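A minimal sketch of the idea, assuming the block holds a power-of-two number of samples so that dividing by the count equals shifting right by its base-2 logarithm. The function names, the parameter log2_count and the scale factor of 256 are assumptions made for illustration; they are not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Reference version: variance scaled by 256, using division.
 * sum is the sum of the samples, sum_sq the sum of their squares,
 * and the block holds 2^log2_count samples. */
static int64_t block_variance_div(int64_t sum, int64_t sum_sq,
                                  int log2_count) {
  const int64_t count = (int64_t)1 << log2_count;
  return 256 * (sum_sq - (sum * sum) / count) / count;
}

/* Same computation with both divisions replaced by right shifts.
 * Every shifted operand is non-negative for consistent inputs
 * (sum * sum >= 0 and sum_sq >= sum * sum / count), so the shift
 * rounds exactly like the division. */
static int64_t block_variance_shift(int64_t sum, int64_t sum_sq,
                                    int log2_count) {
  assert(log2_count >= 0 && log2_count < 31);
  return (256 * (sum_sq - ((sum * sum) >> log2_count))) >> log2_count;
}
```

For the four samples 1, 2, 3, 4 (sum 10, sum of squares 30, log2_count 2), both versions return 320.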
    • Corrected optimization of 8x8 DCT code · 5c22224e
      Peter de Rivaz authored
      The 8x8 DCT uses a fast version whenever possible.
      There was a mistake in the checking code which
      meant sometimes the fast version was used when it
      was not safe to do so.
      
      Change-Id: I154c84c9e2d836764768a11082947ca30f4b5ab7
      (cherry picked from commit fd05fb0c)
    • Refactor choose_partitioning computing scheme · 377d2f02
      Jingning Han authored
      This commit refactors the choose_partitioning function. It removes
      redundant memset calls and makes the encoder calculate the
      per-block variance only when it is needed. This reduces the
      average runtime cost of choose_partitioning by 60%. Overall, it
      reduces speed -6 runtime by 2-5%.
      
      Change-Id: I951922c50d901d0fff77a3bafc45992179bacef9
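One way to compute a value "only when it is needed" is lazy evaluation with a cached result, sketched below. This illustrates the general technique and is not the actual refactoring; the type block_stats, its fields and all function names are hypothetical, and times_computed exists only to make the effect observable.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
  int64_t sum;        /* sum of the samples */
  int64_t sum_sq;     /* sum of the squared samples */
  int log2_count;     /* block holds 2^log2_count samples */
  int64_t variance;   /* meaningful only when variance_valid != 0 */
  int variance_valid;
  int times_computed; /* instrumentation for this sketch only */
} block_stats;

/* Store the accumulated sums; do not compute the variance yet. */
static void block_stats_init(block_stats *b, int64_t sum, int64_t sum_sq,
                             int log2_count) {
  assert(log2_count >= 0 && log2_count < 31);
  b->sum = sum;
  b->sum_sq = sum_sq;
  b->log2_count = log2_count;
  b->variance = 0;
  b->variance_valid = 0;
  b->times_computed = 0;
}

/* Compute the variance (scaled by 256) on first request and cache it. */
static int64_t block_stats_variance(block_stats *b) {
  if (!b->variance_valid) {
    b->variance =
        (256 * (b->sum_sq - ((b->sum * b->sum) >> b->log2_count))) >>
        b->log2_count;
    b->variance_valid = 1;
    ++b->times_computed;
  }
  return b->variance;
}

/* Example consumer: a block is "flat" when its variance is below a
 * caller-supplied threshold. */
static int block_is_flat(block_stats *b, int64_t threshold) {
  return block_stats_variance(b) < threshold;
}
```

Blocks whose variance is never queried cost nothing beyond the running sums, and repeated queries on the same block reuse the cached value.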
    • Multiframe Quality Enhancement(MFQE) in VP9. · 7ac3e3c1
      JackyChen authored
      This is the first version of MFQE in VP9. There are a few TODOs included
      in this version.
      Usage: Add the flag --enable-vp9-postproc when configuring the project.
      In the decoder, use the flag --mfqe on the command line to enable
      MFQE in postproc.
      Note: A low-quality key frame is needed to see the effect of this
      patch. In my experiment, I fixed the qindex of the key frame to 200.
      
      Change-Id: I021f9ce4616ed3574c81e48d968662994b56a396
    • VP9 common for ARMv8 by using NEON intrinsics 18 · 3f7c12da
      James Yu authored
      Add vp9_idct32x32_add_neon.c
      - vp9_idct32x32_1024_add_neon
      
      Change-Id: Ic598b772c28bd3487a8ead7a4598a66b25f9b00f
      Signed-off-by: James Yu <james.yu@linaro.org>
    • VP9 common for ARMv8 by using NEON intrinsics 14 · 3cfed4bf
      James Yu authored
      Add vp9_idct16x16_add_neon.c
      - vp9_idct16x16_256_add_neon_pass1
      - vp9_idct16x16_256_add_neon_pass2
      - vp9_idct16x16_10_add_neon_pass1
      - vp9_idct16x16_10_add_neon_pass2
      
      Change-Id: I54d25b54a36f4371760f54e4036693aaea40a5de
      Signed-off-by: James Yu <james.yu@linaro.org>
  4. 10 Dec, 2014 11 commits
  5. 09 Dec, 2014 8 commits
    • Refactor update_state_rt · e728678c
      Jingning Han authored
      Update the frame motion vector only if the previous frame's motion
      vector is needed as a reference motion vector for the next frame.
      
      Change-Id: Ica50f9d7b46ad4f815bba0d9e30f5546df29546f
    • Fix clang ioc warning due to NULL src_mi pointer. · 4eee74d6
      hkuang authored
      The warning only happens in the VP9 encoder's first pass, because src_mi
      is not set up yet. It does not cause the encoder to fail, since left_mi
      and above_mi are not used in the first pass and are set up again
      in the second pass.
      
      Change-Id: I12dffcd5fb1002b2b2dabb083c8726650e4b5f08
    • VP9 common for ARMv8 by using NEON intrinsics 01 · 5b098b18
      James Yu authored
      Add vp9_loopfilter_neon.c
      - vp9_lpf_horizontal_4_neon
      - vp9_lpf_vertical_4_neon
      - vp9_lpf_horizontal_8_neon
      - vp9_lpf_vertical_8_neon
      
      Change-Id: I97a0d7b399a431c21ee77396be3d5f5a1f7ebccb
      Signed-off-by: James Yu <james.yu@linaro.org>
    • Make RTC coding flow support sub8x8 in key frame coding · 225cdef6
      Jingning Han authored
      This commit enables the use of sub8x8 blocks in RTC key frame
      encoding. It requires the block size to be preset; the encoder then
      decides the coding mode and encodes the bit-stream.
      
      Change-Id: I35aaf8ee2d4d6085432410c7963f339f85a2c19b
    • Cosmetic naming change · 4bacaab4
      Jingning Han authored
      Rename set_modeinfo_offsets to set_mode_info_offsets, to be more
      consistent with the naming convention.
      
      Change-Id: I68ca1f36c4a78127d9439a50c1506a2afd07927d
    • Take out redundant setting of mode_info from set_block_size · f051a7be
      Jingning Han authored
      The later encoding process will use the top-left block's
      mode_info for the pre-determined block size.
      
      Change-Id: I76a90f9ce7f3b2dbc2975b52442114e461c465b5
    • Substantial restructuring of AQ mode 2. · e68c8dcf
      Paul Wilkins authored
      The restructure moves the decision into the rd pick
      modes loop and makes the decision at the 16x16
      block level instead of only at the 64x64 level.
      
      This gives finer granularity and better visual results
      on the clips I have tested. Metrics results are worse
      than the old AQ2, especially for PSNR, and this mode
      now falls between AQ0 and AQ1 in terms of visual
      impact and metrics results.
      
      Further tuning of this to follow.
      
      It should be noted that if there are multiple iterations
      of the recode loop, the segment for a MB could change
      in each loop if the previous loop causes a change in the
      complexity / variance bin of the block. Also, where a block
      gets a delta Q, this will alter the rd multiplier for this block
      in subsequent recode iterations and in frames where the
      segmentation is applied.
      
      Change-Id: I20256c125daa14734c16f7cc9aefab656ab808f7
    • Remove unused rd cost calculation from nonrd_use_partition · 1395ded2
      Jingning Han authored
      The per-block rd cost calculation is not needed when the partition
      size is preset.
      
      Change-Id: Ie5575248bbffb584e908aa13097f697ace6ec747
  6. 08 Dec, 2014 1 commit
    • SSSE3 Optimization for Atom processors using new instruction selection and ordering · 8f9d94ec
      levytamar82 authored
      The function vp9_filter_block1d16_h8_ssse3 uses the PSHUFB instruction, which has a 3-cycle latency and slows execution when done in blocks of 5 or more on Atom processors.
      By replacing the PSHUFB instructions with other, more efficient single-cycle instructions (PUNPCKLBW + PUNPCKHBW + PALIGNR), performance can be improved.
      In the original code, the PSHUFB mask uses every source byte and copies it to consecutive positions.
      This is done more efficiently by PUNPCKLBW and PUNPCKHBW, using PALIGNR to concatenate the intermediate results and then shift right by one byte to select the 16 consecutive bytes of the final result.
      
      For example:
      filter = 0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8
      Reg = 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
      REG1 = PUNPCKLBW Reg, Reg = 0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7
      REG2 = PUNPCKHBW Reg, Reg = 8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15
      PALIGNR REG2, REG1, 1 = 0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8
      
      This optimization improved the function's performance by 23% and produced a 3% user-level gain on 1080p content on Atom processors.
      As expected, no performance impact was observed on Core processors.
      
      Change-Id: I3cec701158993d95ed23ff04516942b5a4a461c0
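The equivalence in the example above can be checked with a portable scalar emulation of the four instructions. This is a sketch for verification only: it does not use SSSE3 intrinsics and is not the assembly from the commit. The helper names (emu_pshufb, emu_punpcklbw, emu_punpckhbw, emu_palignr, shuffle_path, unpack_path) are hypothetical, and each helper assumes its output array does not alias its inputs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* PSHUFB: dst[i] = src[mask[i] & 15], or 0 when bit 7 of mask[i] is set. */
static void emu_pshufb(uint8_t dst[16], const uint8_t src[16],
                       const uint8_t mask[16]) {
  int i;
  for (i = 0; i < 16; ++i)
    dst[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0f];
}

/* PUNPCKLBW: interleave the low 8 bytes of a and b. */
static void emu_punpcklbw(uint8_t dst[16], const uint8_t a[16],
                          const uint8_t b[16]) {
  int i;
  for (i = 0; i < 8; ++i) {
    dst[2 * i] = a[i];
    dst[2 * i + 1] = b[i];
  }
}

/* PUNPCKHBW: interleave the high 8 bytes of a and b. */
static void emu_punpckhbw(uint8_t dst[16], const uint8_t a[16],
                          const uint8_t b[16]) {
  int i;
  for (i = 0; i < 8; ++i) {
    dst[2 * i] = a[8 + i];
    dst[2 * i + 1] = b[8 + i];
  }
}

/* PALIGNR: concatenate hi:lo into 32 bytes (lo in the low half),
 * shift right by imm bytes, and keep the low 16 bytes. */
static void emu_palignr(uint8_t dst[16], const uint8_t hi[16],
                        const uint8_t lo[16], int imm) {
  uint8_t cat[32];
  int i;
  assert(imm >= 0 && imm <= 32);
  memcpy(cat, lo, 16);
  memcpy(cat + 16, hi, 16);
  for (i = 0; i < 16; ++i) dst[i] = (i + imm < 32) ? cat[i + imm] : 0;
}

/* Original approach: one shuffle with the filter mask from the example. */
static void shuffle_path(uint8_t out[16], const uint8_t reg[16]) {
  static const uint8_t filter[16] = { 0, 1, 1, 2, 2, 3, 3, 4,
                                      4, 5, 5, 6, 6, 7, 7, 8 };
  emu_pshufb(out, reg, filter);
}

/* Replacement approach: two unpacks followed by a one-byte align. */
static void unpack_path(uint8_t out[16], const uint8_t reg[16]) {
  uint8_t reg1[16], reg2[16];
  emu_punpcklbw(reg1, reg, reg);
  emu_punpckhbw(reg2, reg, reg);
  emu_palignr(out, reg2, reg1, 1);
}
```

Calling shuffle_path and unpack_path on the same 16 input bytes yields identical output. The emulation says nothing about latency; the performance difference is specific to the hardware.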
  7. 06 Dec, 2014 4 commits
  8. 05 Dec, 2014 5 commits