What you'd actually do

Lead performance analysis, profiling, benchmarking, and analytical modeling across GPU and AI accelerator architectures, identifying bottlenecks, architectural trade-offs, and optimization opportunities across hardware, software, and system layers.

Analyze end-to-end AI workloads and serving systems, including model execution, runtime behavior, memory systems, communication collectives, and workload mapping strategies to understand performance, scalability, efficiency, and cost drivers.

Develop performance, efficiency, and system-level models to evaluate new architectural features, memory and interconnect innovations, collective communication mechanisms, and accelerator design choices, driving perf/W and TCO optimization.

Correlate silicon measurements, software traces, and kernel execution behavior with architectural models and simulators to validate assumptions, improve model fidelity, and guide future architecture decisions.

Drive kernel-level, runtime-level, and system-level performance optimizations across AI training and inference workloads, translating workload insights into actionable hardware and software improvements.

Skills

Required

Master's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 3+ years technical engineering experience OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 5+ years technical engineering experience OR equivalent experience.
Ability to meet Microsoft, customer and/or government security screening requirements.

Nice to have

4+ years of experience in Computer Architecture, AI Systems, or closely related technical domains.
MS or PhD in Computer Architecture, Computer Systems, Electrical Engineering, Machine Learning, High-Performance Computing, or a related field.
Strong understanding of GPU and AI accelerator architectures, including compute pipelines, memory hierarchies, interconnects, collective communication, and parallel execution models.
Experience with analytical performance modeling, architectural simulation, workload characterization, and silicon correlation for accelerator and system design.
Expertise in performance profiling, benchmarking, and root-cause analysis.

What the JD emphasized

AI accelerator platforms

large-scale AI systems

analytical performance modeling

workload characterization

profiling

end-to-end performance analysis

GPU and accelerator architectures

AI training and inference workloads

performance, efficiency, and system-level models

memory and interconnect innovations

collective communication mechanisms

silicon measurements

software traces

kernel execution behavior

architectural models

simulators

hardware and software improvements

Overview

Do you want to be at the forefront of innovating the latest hardware designs to propel Microsoft’s cloud growth? Are you seeking a unique career opportunity that combines technical capabilities and cross-team collaboration with business insight and strategy?

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to achieve our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Join the Systems Planning and Architecture (SPARC) team within Microsoft’s Azure Hardware Systems and Infrastructure (AHSI) organization, the team behind Microsoft’s expanding cloud infrastructure and responsible for powering Microsoft’s “Intelligent Cloud” mission. Microsoft delivers more than 200 online services to more than one billion individuals worldwide. We deliver the core infrastructure and foundational technologies for Microsoft's cloud businesses, including Microsoft Azure, Bing, MSN, Office 365, OneDrive, Skype, Teams, and Xbox Live.

We are seeking a Senior AI Hardware Architect to join the AI Systems Architecture (ASA) group, where we define and optimize next-generation AI accelerator platforms and large-scale AI systems. In this role, you will drive analytical performance modeling, workload characterization, profiling, and end-to-end performance analysis across GPU and accelerator architectures, working across hardware, software, and system boundaries.

You will analyze real-world AI workloads on modern GPUs and in-house accelerators, identifying performance bottlenecks and architectural trade-offs through modeling, simulation, benchmarking, and silicon measurement. You will develop models to evaluate new architectural features, memory and communication subsystems, collective operations, and workload mappings, while correlating silicon data and software traces with architectural models to drive performance, perf/W, and TCO optimizations.

You will collaborate closely with architecture, microarchitecture, compiler, runtime, networking, and systems teams, contributing to performance modeling, correlation, and analysis tools. Through quantitative analysis and cross-platform insights, you will help shape future AI accelerator and system architectures, improving performance, efficiency, scalability, and cost.

Responsibilities

Lead performance analysis, profiling, benchmarking, and analytical modeling across GPU and AI accelerator architectures, identifying bottlenecks, architectural trade-offs, and optimization opportunities across hardware, software, and system layers.
Analyze end-to-end AI workloads and serving systems, including model execution, runtime behavior, memory systems, communication collectives, and workload mapping strategies to understand performance, scalability, efficiency, and cost drivers.
Develop performance, efficiency, and system-level models to evaluate new architectural features, memory and interconnect innovations, collective communication mechanisms, and accelerator design choices, driving perf/W and TCO optimization.
Correlate silicon measurements, software traces, and kernel execution behavior with architectural models and simulators to validate assumptions, improve model fidelity, and guide future architecture decisions.
Drive kernel-level, runtime-level, and system-level performance optimizations across AI training and inference workloads, translating workload insights into actionable hardware and software improvements.
Design and develop data analysis, correlation, visualization, and performance modeling tools that improve debugging efficiency, architectural insight, and decision-making velocity.
Partner closely with architecture, microarchitecture, compiler, runtime, networking, and systems teams to evaluate design trade-offs and influence product roadmaps through quantitative analysis and technical leadership.
Present performance findings, architectural recommendations, and design trade-offs to senior technical leadership through architecture reviews, technical reports, and strategic planning discussions.

Qualifications

Required Qualifications:

Master's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 3+ years technical engineering experience OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 5+ years technical engineering experience OR equivalent experience.

Other Requirements:

Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

4+ years of experience in Computer Architecture, AI Systems, or closely related technical domains.
MS or PhD in Computer Architecture, Computer Systems, Electrical Engineering, Machine Learning, High-Performance Computing, or a related field.
Strong understanding of GPU and AI accelerator architectures, including compute pipelines, memory hierarchies, interconnects, collective communication, and parallel execution models.
Experience with analytical performance modeling, architectural simulation, workload characterization, and silicon correlation for accelerator and system design.
Expertise in performance profiling, benchmarking, and root-cause analysis using hardware counters, software traces, and workload-level measurements.
Hands-on experience analyzing and optimizing AI kernels, with the ability to connect kernel behavior to architectural and system-level performance.
Experience developing performance, efficiency, or TCO models to evaluate architectural features, memory systems, networking, and large-scale AI deployments.
Strong programming skills in Python and C/C++ for performance analysis, tooling, benchmarking, automation, and data analysis.
Deep understanding of AI and HPC workloads, including training and inference of large-scale transformer-based models.
Experience running and analyzing end-to-end AI workloads on production-scale systems, with the ability to diagnose bottlenecks across hardware, runtime, networking, and system layers.
Familiarity with modern AI frameworks and serving stacks, including PyTorch, vLLM, SGLang, and distributed training or inference frameworks.
Knowledge of modern AI optimization techniques, including quantization, sparsity, sharding strategies, KV-cache management, Flash Attention, and communication-computation overlap.
Strong written and verbal communication skills, with experience presenting architectural analyses, performance studies, and design recommendations to technical stakeholders and leadership.

Hardware Engineering IC4 - The typical base pay range for this role across the U.S. is USD $119,800.00 - $234,700.00 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $160,200.00 - $261,000.00 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.