Complex axpyf optimization #580

Open
wants to merge 3 commits into
base: master
from

Commits on Nov 18, 2021

  1. Optimised AXPYF routine for complex float and complex double

    Details:
        - Added SIMD code
        - Processing 5 rows at a time in SIMD loop to improve performance
    
    AMD-Internal: [CPUPL-1054]
    
    Change-Id: I2ac93f25895dccfc42e14be0689e6d4e655d6a0a
    managalv authored and nsinghamd committed Nov 18, 2021
    ea4f639
  2. Optimized double complex axpyf kernel for zgemv

    Details:
      - Implemented zaxpyf kernel with fuse factor=4 for zgemv.
      - Modified BLAS interface call for zgemv to reduce framework overhead.
      - Directed gemv to dotv in the case where dimension of y vector is 1.
      - When alpha = 0, gemv reduces to scalv of y with beta. Added code to
        return early after scaling the y vector with beta.
    
    AMD-Internal: [CPUPL-1402]
    Change-Id: I2231285fe3060982d4434466346a040b7ab803fc
    nsinghamd committed Nov 18, 2021
    821f856
  3. 3b4a600
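
For readers unfamiliar with the operations named in the commit messages, the following is a minimal scalar sketch in C99, not the SIMD kernels from this PR. It assumes column-major storage, no conjugation or transposition, and unit-stride vectors. The names `ref_zaxpyf`, `ref_zgemv_n`, and `FUSE_FAC` are hypothetical and chosen only for illustration. It shows axpyf as several axpy updates fused into one pass over y, and a gemv built on top of it with the two shortcuts described in the second commit: the early return when alpha = 0 and the dot-product path when y has a single element.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

#define FUSE_FAC 4 /* hypothetical constant mirroring "fuse factor=4" */

/* axpyf: y := y + alpha * A * x, with A an m x b column-major panel.
 * Each y[i] is read and written once for all b columns, which is the
 * point of fusing b separate axpy operations. */
static void ref_zaxpyf(size_t m, size_t b, double complex alpha,
                       const double complex *a, size_t lda,
                       const double complex *x, double complex *y)
{
    for (size_t i = 0; i < m; ++i) {
        double complex acc = 0.0;
        for (size_t j = 0; j < b; ++j)
            acc += a[i + j * lda] * x[j];
        y[i] += alpha * acc;
    }
}

/* gemv (no transpose): y := beta * y + alpha * A * x, A is m x n. */
static void ref_zgemv_n(size_t m, size_t n, double complex alpha,
                        const double complex *a, size_t lda,
                        const double complex *x, double complex beta,
                        double complex *y)
{
    assert(lda >= m);

    /* scalv step: y := beta * y (beta == 0 overwrites y outright). */
    for (size_t i = 0; i < m; ++i)
        y[i] = (beta == 0.0) ? 0.0 : beta * y[i];

    /* alpha == 0: nothing from A * x contributes, so return early. */
    if (alpha == 0.0)
        return;

    /* y has one element: the update is a single dot product (dotv). */
    if (m == 1) {
        double complex rho = 0.0;
        for (size_t j = 0; j < n; ++j)
            rho += a[j * lda] * x[j];
        y[0] += alpha * rho;
        return;
    }

    /* General case: walk A in panels of up to FUSE_FAC columns. */
    for (size_t j = 0; j < n; j += FUSE_FAC) {
        size_t b = (n - j < FUSE_FAC) ? (n - j) : FUSE_FAC;
        ref_zaxpyf(m, b, alpha, a + j * lda, lda, x + j, y);
    }
}
```

A vectorized kernel would replace the inner loops of `ref_zaxpyf` with SIMD loads and fused multiply-adds; a scalar reference like this one is mainly useful for checking such a kernel's results.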