Technical Program Manager – ROCm Libraries

AMD · TX, Texas US, USA, US · 27 days ago

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.

Together, we advance your career.

THE ROLE:

Do you want to drive end-to-end delivery of artificial intelligence, math, computer vision, and communication libraries to enable high performance computing and artificial intelligence? AMD is searching for a talented and motivated Program Manager to join the GPU libraries team developing Math Libraries as part of the AMD ROCm™ Open Software Platform.

THE PERSON:

You are accustomed to working in a dynamic, geographically distributed agile team, where partnership and collaboration are paramount. You possess excellent written and verbal communication skills, strong organization, and attention to detail. With a keen interest in data, you will draw on your strong technical background, analytical capabilities, and interpersonal skills to drive process improvements, improve program operations, and engage with key partners at all levels, from driving technical discussions with developers through to communication with executives.

As a TPM in this group, you will not just track schedules; you will be an active technical partner.
You will facilitate architectural decision-making, assist with the triage and debugging of complex issues, and translate intricate technical requirements for diverse stakeholders ranging from kernel engineers to executive leadership. You will operate at the intersection of High-Performance Computing (HPC) and Artificial Intelligence (AI), ensuring our math libraries deliver world-class performance on AMD Instinct™ and AMD Radeon™ accelerators.

KEY RESPONSIBILITIES:

Program Execution & Delivery: Drive the end-to-end lifecycle of the ROCm BLAS stack, including release planning, roadmap definition, and execution for rocBLAS, hipBLASLt, and hipSPARSELt.
Compiler & Code Gen: Facilitate discussions on code generation strategies (e.g., auto-tuning kernels) and optimizations at the Intermediate Representation (IR) level to maximize hardware utilization.
Technical Triage & Debugging: Actively participate in the triage process for software defects. Leverage your technical background to help engineers prioritize bugs, understand root causes (whether in the library or the compiler backend), and unblock critical paths.
Architectural Decision Support: Facilitate technical discussions between software architects, research teams, and silicon engineers. Help drive consensus on API designs and optimization strategies for new hardware generations.
Stakeholder Management: Translate complex technical requirements and status updates into clear, actionable communications. Bridge the gap between technical engineering teams and management.

The primary components you will manage include, but are not limited to:

rocBLAS: The foundational BLAS library implemented in HIP C++ and optimized for AMD GPUs, covering Level-1, Level-2, and Level-3 linear algebra operations.
hipBLASLt: A high-performance library providing flexible General Matrix-Matrix (GEMM) operations, essential for modern AI/ML workloads that require functionality beyond traditional BLAS (e.g., fusion, post-processing).
hipSPARSELt: A sparse marshalling library that enables high-performance sparse matrix operations on AMD discrete GPUs, critical for optimizing memory bandwidth in large-scale AI models.

PREFERRED EXPERIENCE:

Program Management: 5+ years of experience in Technical Program Management, Engineering Management, or as a Senior Software Engineer with leadership responsibilities.
Compiler & Architecture Knowledge:
  Experience working with or managing projects involving compiler technologies (e.g., LLVM, GCC, Open64).
  Understanding of code generation techniques, JIT compilation, or Intermediate Representations (IR) (e.g., MLIR, LLVM IR) and how they impact library performance.
GPU & HPC Domain:
  Deep understanding of GPU computing (AMD ROCm, CUDA, or OpenCL).
  Familiarity with High-Performance Computing (HPC) and Artificial Intelligence (AI) workloads.
  Knowledge of linear algebra concepts (GEMM, Sparse Matrices, Tensor operations).
Technical Literacy: Ability to read and understand technical documentation, bug reports, and basic code structures (C++, Python, Assembly/ISA familiarity is a plus).
Process & Tools: Proficiency with Agile/Scrum methodologies and tools like Jira, Confluence, and GitHub.
Library Development: Hands-on experience developing or managing math libraries (BLAS, FFT, LAPACK) or similar performance-critical software.

ACADEMIC CREDENTIALS:

Bachelor’s or Master’s degree in Computer Science, Software Engineering, Electrical Engineering, Mathematics, or equivalent strongly preferred.
Certifications such as the PMP or agile certification would be an asset.

THE ROLE

AMD is seeking a Technical Program Manager to join the ROCm Libraries organization and lead execution across the MIOpen, Composable Kernel (CK), and hipDNN software stack.

In this role, you will be responsible for end-to-end program execution across a complex, performance-critical set of GPU software libraries that support training and inference workloads on AMD Instinct™ GPUs. You will work closely with engineering and product leadership to drive predictable delivery, execution rigor, and continuous improvement of software development practices across multiple teams.

This role operates at the intersection of kernel performance, library architecture, build systems, and customer-driven requirements, requiring strong technical judgment, structured execution, and clear communication across deeply technical stakeholders.

THE PERSON

You have experience operating in technically complex software environments where performance, correctness, and platform compatibility are critical. You are comfortable navigating ambiguity and are effective at introducing structure, execution discipline, and data-driven decision-making.

You are able to translate complex engineering trade-offs into clear plans, risks, metrics, and delivery commitments, and you communicate effectively with both engineering teams and senior stakeholders.

KEY RESPONSIBILITIES

Program Execution
Lead end-to-end delivery of MIOpen, hipDNN, and Composable Kernel across multiple GPU architectures.
Coordinate execution across performance-critical kernels, APIs, backend integrations, and build systems.
Translate product and customer requirements into clear plans, schedules, and deliverables.
Identify and mitigate risks related to performance, compatibility, build complexity, and cross-team dependencies.
Provide clear, regular status reporting on progress, risks, and execution health.

Execution Model & SDLC Maturity
Define and evolve execution models (Agile, hybrid, milestone-driven, performance-driven) appropriate to team and product needs.
Establish best practices for planning, estimation, prioritization, dependency management, and delivery accountability.
Assess execution maturity across teams and drive pragmatic, adoptable improvements in partnership with engineering leadership.

Metrics & Continuous Improvement
Define and track meaningful execution metrics (e.g., predictability, cycle time, throughput, planning accuracy).
Use data and retrospectives to drive continuous improvement and measurable outcomes.
Build durable execution mechanisms, including reviews, retrospectives with follow-through, and shared visibility dashboards.

PREFERRED EXPERIENCE

Experience managing complex, interdependent software programs in GPU software, deep learning infrastructure, or high-performance computing.
Prior experience as a software developer, systems engineer, or technical program manager working close to performance-critical or low-level software.
Demonstrated success improving software development lifecycle maturity, execution discipline, and delivery predictability.
Familiarity with GPU software stacks, kernel libraries, and related tooling (e.g., HIP, math libraries, build systems).
Strong analytical, reporting, and executive communication skills.
Experience applying different execution models based on product and engineering needs.
Proficiency with Jira, Confluence, and common program management tools.

ACADEMIC CREDENTIALS

Bachelor’s or Master’s degree in Computer Science, Software Engineering, Electrical Engineering, Mathematics, or a related technical field.
Formal project or program management education or certifications (e.g., PMP, Agile, Scrum) are a plus.

LOCATION: Austin, Texas

#LI-DR1
#LI-HYBRID

Benefits offered are described: AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here.

This posting is for an existing vacancy.

Headquarters

TX, Texas US, USA

Work Location

on-site

Job Category

Engineering

Application Deadline

Not specified

Job Type

full-time

Experience Level

manager-level

Application Method

Apply via JobSpring

Salary

Not specified
