Senior Researcher - GPU Performance at Microsoft

What you'd actually do

Design, implement, and optimize GPU kernels for complex computational workloads such as AI inferencing.

Research and develop novel optimization techniques for generation of GPU kernels.

Profile and analyze kernel performance using advanced diagnostic tools.

Generate automated solutions for kernel optimization and tuning.

Collaborate with other researchers to improve model performance.

Skills

Required

Doctorate in relevant field or equivalent experience
2+ years of experience in GPU architecture, memory hierarchies, parallel computing and algorithm optimization
2+ years of experience in GPU programming, including performance profiling and optimization tools
Strong C++ programming skills

Nice to have

5+ years of experience in GPU programming and optimization
Expert knowledge of CUDA, ROCm, Triton, PTX, CUTLASS, or similar GPU programming frameworks
Experience with machine learning frameworks (PyTorch, TensorFlow)
Familiarity with compiler optimization techniques and background in auto-tuning and automated code generation
Publication record in relevant conferences or journals

Overview

Generative AI is transforming how people create, collaborate, and communicate - redefining productivity across Microsoft 365 and our customers globally. At Microsoft, we run the biggest platform for collaboration and productivity in the world with hundreds of millions of consumer/enterprise users. Tackling AI efficiency challenges is crucial for delivering these experiences at scale.

Within our Microsoft-wide Systems Innovation initiative, we are working to advance efficiency across AI systems, exploring novel designs and optimizations across the AI stack: models, AI frameworks, cloud infrastructure, and hardware. We are an Applied Research team driving mid- and long-term product innovations. We collaborate closely with multiple research teams and product groups across the globe, who bring a breadth of technical knowledge in cloud systems, machine learning, and software engineering. We communicate our research both internally and externally through academic publications, open-source releases, blog posts, patents, and industry conferences. We also collaborate with academic and industry partners to advance the state of the art and target material product impact that will affect hundreds of millions of customers.

We are looking for a Senior Researcher (GPU Performance, Hardware/Software Codesign) to explore hardware- and kernel-level optimizations that deliver significant efficiency gains for Large Language Models and Generative AI experiences.

The qualified candidate will have a solid background in GPU architecture, accelerator design, machine learning, or systems research and the ambition to apply it to large-scale production systems. This role combines deep technical knowledge in GPU architecture with practical implementation skills to create efficient, scalable computational kernels. The qualified candidate is also expected to demonstrate a history of solving hard technical problems and to be motivated to tackle the hardest problems in building a full end-to-end AI stack. An entrepreneurial approach and the ability to take initiative and move fast are essential.

For further reading, see: Efficient AI - Microsoft Research

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

Design, implement, and optimize GPU kernels for complex computational workloads such as AI inferencing.
Research and develop novel optimization techniques for generation of GPU kernels.
Profile and analyze kernel performance using advanced diagnostic tools.
Generate automated solutions for kernel optimization and tuning.
Collaborate with other researchers to improve model performance.
Document optimization strategies and maintain performance benchmarks.
Contribute to the development of internal GPU computing frameworks.

Qualifications

Required Qualifications:

Doctorate in relevant field OR equivalent experience.
2+ years of experience in GPU architecture, memory hierarchies, parallel computing and algorithm optimization.
2+ years of experience in GPU programming, including performance profiling and optimization tools.
Strong C++ programming skills.

Other Requirements: Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

5+ years of experience in GPU programming and optimization, expert knowledge of CUDA, ROCm, Triton, PTX, CUTLASS, or similar GPU programming frameworks.
Experience with machine learning frameworks (PyTorch, TensorFlow).
Familiarity with compiler optimization techniques and background in auto-tuning and automated code generation.
Publication record in relevant conferences or journals (MLSys, NeurIPS, ICML, ICLR, AISTATS, ACL, EMNLP, NAACL, ISCA, MICRO, ASPLOS, HPCA, SOSP, OSDI, NSDI, etc.)


Research Sciences IC4 - The typical base pay range for this role across the U.S. is USD $119,800 - $234,700 per year. A different range applies to specific work locations within the San Francisco Bay area and New York City metropolitan area; the base pay range for this role in those locations is USD $158,400 - $258,000 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.