WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. **Together, we advance your career.**

THE ROLE:

This software engineer role will help drive AMD’s strategy, architecture, optimization and tooling to achieve industry-leading AI pre-training and distributed inference performance on AMD GPUs. You will partner across hardware architecture, AI frameworks, compilers, runtime, ROCm, developer tools and models to scale performance analysis and optimization.

As an Engineer of Collectives and Network performance, you will drive end-to-end performance attainment across the entire software stack, focusing on getting the best performance on multiple generations of AMD GPUs with a wide range of models, including the latest state-of-the-art AI models. You will help set the strategy and roadmap for general optimization, faster support for new models, and out-of-the-box performance.

If you are passionate about performance optimization, getting the best out of the hardware, and shaping the future of AI acceleration, then this role is for you.

THE PERSON:

The ideal candidate will have deep knowledge of network, NIC and GPU hardware architecture, software optimization, performance modeling, AI frameworks and the latest trends in inference and training optimization. The candidate will also have hands-on experience in mapping model architecture to low-level software and hardware, and in understanding the impact of each layer of the stack on model performance. Strong knowledge of the latest generative model architectures, especially SoTA models, and of distributed inference and deployment at scale is crucial.

KEY RESPONSIBILITIES:

Help with strategy and roadmap for AMD Collectives and Network optimizations.
Provide guidelines to customers on efficient network load-balancing, workload scheduling and model sharding strategies.
Performance tuning, profiling and analysis of large-scale LLM, diffusion, multimodal, RecSys and generative AI models, single-node and distributed, including exploration of tradeoffs and design decisions.
Participate in hardware-software co-design for future hardware optimizations – especially on scale-up networks, NIC and scale-out networks.
Develop and improve frameworks, tools and infrastructure for performance estimation, modeling and reporting.
Communicate and present the results of performance analysis and modeling to stakeholders and senior leadership, and provide concrete recommendations.
Collaborate across teams and the wider organization to identify opportunities and develop strategies.

PREFERRED EXPERIENCE:

Multiple years of technical experience in performance optimization.
Strong technical expertise and experience in performance analysis, projection, and network hardware architecture.
Deep knowledge of and hands-on experience with AI frameworks such as PyTorch, JAX, vLLM, and SGLang.
Strong technical leadership skills and the ability to work collaboratively with cross-functional teams.
Experience mentoring, coaching, and inspiring a diverse and talented team of researchers and engineers.
Excellent written, verbal, and presentation skills, and the ability to coordinate internally and externally.

ACADEMIC CREDENTIALS:

A PhD or master's degree in computer science, electrical engineering, or a related field.

LOCATION:

San Jose, CA (hybrid)

Benefits offered are described: AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here.

This posting is for an existing vacancy.