Principal Silicon Performance Architect

Microsoft · Big Tech · Redmond, WA +2 · Software Engineering

This role focuses on optimizing AI inference workloads by exploring micro-architectural innovations and validating end-to-end performance. The Principal Silicon Performance Architect owns performance modeling, analysis, and simulation infrastructure, working closely with chip, system, and software architects to drive data-backed design decisions that improve throughput, latency, and efficiency.

What you'd actually do

  1. Extend and adapt simulation infrastructure to model new micro-architecture innovations for AI inference.
  2. Analyze performance for current and forward-looking AI inference workloads across latency, throughput, and efficiency dimensions.
  3. Drive design-space exploration using AI-assisted workflows, automation, and large-scale experiment generation.
  4. Communicate performance insights clearly and influence architecture decisions through data-driven recommendations.
  5. Collaborate closely with chip, system, and software architects to propose, evaluate, and iterate on architectural variations.
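To give a flavor of the performance-modeling work described above, here is a minimal roofline-style sketch in Python for reasoning about whether an inference kernel is compute-bound or memory-bound. All hardware numbers and kernel sizes below are illustrative assumptions, not specs of any real Microsoft silicon; production modeling would use detailed cycle-level simulators rather than a first-order bound like this.

```python
# Hypothetical roofline-style estimate: given a kernel's FLOP count and
# memory traffic, bound its latency by assumed peak compute and bandwidth.
# All numbers are illustrative assumptions, not real chip specifications.

def roofline_latency(flops: float, bytes_moved: float,
                     peak_flops: float, peak_bw: float) -> dict:
    """Return compute-bound and memory-bound latency bounds (seconds)
    for a single kernel, plus which bound dominates."""
    t_compute = flops / peak_flops      # time if perfectly compute-bound
    t_memory = bytes_moved / peak_bw    # time if perfectly memory-bound
    return {
        "t_compute_s": t_compute,
        "t_memory_s": t_memory,
        "latency_s": max(t_compute, t_memory),   # first-order lower bound
        "bound": "compute" if t_compute >= t_memory else "memory",
        "arith_intensity": flops / bytes_moved,  # FLOPs per byte moved
    }

# Example: a batch-1 decode GEMM with a 4096x4096 fp16 weight matrix.
# 2*4096*4096 FLOPs; weight traffic (2 bytes/element) dominates memory.
est = roofline_latency(
    flops=2 * 4096 * 4096,
    bytes_moved=4096 * 4096 * 2,  # fp16 weights read once (assumption)
    peak_flops=200e12,            # assumed 200 TFLOP/s peak compute
    peak_bw=1.5e12,               # assumed 1.5 TB/s memory bandwidth
)
print(est["bound"])  # batch-1 decode GEMMs are typically memory-bound
```

At an arithmetic intensity of 1 FLOP/byte, this kernel sits far below the assumed machine balance (~133 FLOPs/byte), which is why small-batch inference tends to be bandwidth-limited and why micro-architectural features that reduce memory traffic matter for this role.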

Skills

Required

  • C/C++
  • Python
  • performance modeling
  • simulation infrastructure

Nice to have

  • Master's or Ph.D. in Electrical Engineering, Computer Engineering, or related field
  • chip architecture
  • micro-architecture analysis
  • profiling
  • bottleneck analysis
  • experimental design
  • micro-architecture trade-off analysis
  • AI inference acceleration features
  • accelerator or GPU performance analysis
  • AI inference software stack
  • compilers
  • runtimes
  • model serving systems
  • architectural simulators
  • performance modeling codebases

What the JD emphasized

  • AI performance
  • AI inference

Other signals

  • AI performance modeling
  • micro-architecture exploration
  • inference workload validation
  • hardware/software co-design