What you'd actually do

Lead and develop a team of engineers working across multiple layers of the AI software stack to enable large-scale training and inference.

Set technical vision and execution strategy for model performance benchmarking, optimization, and deployment across GPUs and Microsoft hardware.

Drive performance outcomes by prioritizing and overseeing efforts to benchmark, profile, debug, and optimize training and inference workloads.

Own performance health by establishing mechanisms to monitor regressions, measure impact, and continuously improve time-to-deploy and hardware efficiency.

Partner cross-functionally with research, product, infrastructure, and hardware teams to deliver scalable, production-ready AI performance improvements.

Skills

Required

Bachelor's Degree in Computer Science or related technical field
6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
Software engineering principles
computer architecture
GPU architecture
hardware acceleration for neural networks

Nice to have

Master's Degree in Computer Science or related technical field AND 10+ years of software engineering experience, including 6+ years in engineering management
Bachelor's Degree in Computer Science or related technical field AND 12+ years of software engineering experience, including 6+ years in engineering management
leading teams responsible for end-to-end performance analysis and optimization of LLMs, AI systems, or HPC workloads
GPU profiling and performance analysis tools
lead cross-team initiatives
align stakeholders
translate research or platform capabilities into scalable, production-ready solutions
people leadership skills, including hiring, coaching, performance management, and career development
building high-performing, inclusive teams
AI / ML infrastructure
DNN or LLM training and/or inference systems
PyTorch
TensorFlow
ONNX Runtime
GPU software stacks
CUDA
ROCm
Triton

Overview

As a Principal Software Engineering Manager - AI Frameworks, you will lead and grow a group of engineers working across multiple layers of the AI software serving stack, including fundamental abstractions, runtimes, libraries, and application programming interfaces (APIs). You will be responsible for setting technical direction, prioritizing investments, and ensuring the team delivers high-impact performance improvements that enable large-scale model training and inference.

In this role, you will guide the team’s work on benchmarking OpenAI and other large language models (LLMs) across GPUs and Microsoft hardware, driving performance optimization, monitoring regressions, and accelerating time-to-deployment. You will partner closely with researchers, product teams, and platform owners to translate performance insights into production-ready improvements that reduce hardware footprint and support Microsoft Azure’s capex efficiency goals.

Responsibilities

Lead and develop a team of engineers working across multiple layers of the AI software stack to enable large-scale training and inference.
Set technical vision and execution strategy for model performance benchmarking, optimization, and deployment across GPUs and Microsoft hardware.
Drive performance outcomes by prioritizing and overseeing efforts to benchmark, profile, debug, and optimize training and inference workloads.
Own performance health by establishing mechanisms to monitor regressions, measure impact, and continuously improve time-to-deploy and hardware efficiency.
Partner cross-functionally with research, product, infrastructure, and hardware teams to deliver scalable, production-ready AI performance improvements.
Balance short-term delivery and long-term investments, ensuring the team’s work aligns with organizational goals, platform roadmaps, and Azure capex objectives.
Build a strong engineering culture through coaching, feedback, hiring, and career development, enabling the team to operate with increasing autonomy and impact.

Qualifications

Minimum/Required Qualifications:

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.

Preferred:

Master’s Degree in Computer Science or related technical field AND 10+ years of software engineering experience, including 6+ years in engineering management, OR Bachelor’s Degree in Computer Science or related technical field AND 12+ years of software engineering experience, including 6+ years in engineering management, or equivalent experience.
Strong technical foundation in software engineering principles, computer architecture, GPU architecture, and hardware acceleration for neural networks, with the ability to guide teams working in these areas.
Experience leading teams responsible for end-to-end performance analysis and optimization of LLMs, AI systems, or HPC workloads, including use of GPU profiling and performance analysis tools.
Demonstrated ability to lead cross-team initiatives, align stakeholders, and translate research or platform capabilities into scalable, production-ready solutions.
Proven people leadership skills, including hiring, coaching, performance management, and career development, with a track record of building high-performing, inclusive teams.
Exposure to AI / ML infrastructure, including DNN or LLM training and/or inference systems, and experience with at least one modern deep learning framework (e.g., PyTorch, TensorFlow, ONNX Runtime).
Familiarity with GPU software stacks and acceleration technologies such as CUDA, ROCm, Triton, or equivalent, sufficient to guide technical direction and evaluate tradeoffs.

#AIInfra

Software Engineering M5 - The typical base pay range for this role across the U.S. is USD $142,800.00 - $274,800.00 per year. A different range applies to specific work locations within the San Francisco Bay Area and New York City metropolitan area, where the base pay range for this role is USD $188,000.00 - $304,200.00 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about **requesting accommodations.**