Principal Silicon Performance Architect

Microsoft · Big Tech · Redmond, WA +2 · Software Engineering

This role focuses on optimizing AI inference workloads by exploring micro-architectural innovations and validating end-to-end performance. The Principal Silicon Performance Architect owns performance modeling, analysis, and simulation infrastructure, working closely with chip, system, and software architects to drive data-backed design decisions that improve throughput, latency, and efficiency.

What you'd actually do

  1. Extend and adapt simulation infrastructure to model new micro-architecture innovations for AI inference.
  2. Analyze performance for current and forward-looking AI inference workloads across latency, throughput, and efficiency dimensions.
  3. Drive design-space exploration using AI-assisted workflows, automation, and large-scale experiment generation.
  4. Communicate performance insights clearly and influence architecture decisions through data-driven recommendations.
  5. Collaborate closely with chip, system, and software architects to propose, evaluate, and iterate on architectural variations.
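To give a flavor of the performance-modeling work described above, here is a minimal roofline-style sketch in Python for reasoning about whether an inference kernel is compute-bound or memory-bound. All hardware numbers and kernel sizes below are illustrative assumptions, not specs of any real Microsoft silicon; production modeling would use detailed cycle-level simulators rather than a first-order bound like this.

```python
# Hypothetical roofline-style estimate: given a kernel's FLOP count and
# memory traffic, bound its latency by assumed peak compute and bandwidth.
# All numbers are illustrative assumptions, not real chip specifications.

def roofline_latency(flops: float, bytes_moved: float,
                     peak_flops: float, peak_bw: float) -> dict:
    """Return compute-bound and memory-bound latency bounds (seconds)
    for a single kernel, plus which bound dominates."""
    t_compute = flops / peak_flops      # time if perfectly compute-bound
    t_memory = bytes_moved / peak_bw    # time if perfectly memory-bound
    return {
        "t_compute_s": t_compute,
        "t_memory_s": t_memory,
        "latency_s": max(t_compute, t_memory),   # first-order lower bound
        "bound": "compute" if t_compute >= t_memory else "memory",
        "arith_intensity": flops / bytes_moved,  # FLOPs per byte moved
    }

# Example: a batch-1 decode GEMM with a 4096x4096 fp16 weight matrix.
# 2*4096*4096 FLOPs; weight traffic (2 bytes/element) dominates memory.
est = roofline_latency(
    flops=2 * 4096 * 4096,
    bytes_moved=4096 * 4096 * 2,  # fp16 weights read once (assumption)
    peak_flops=200e12,            # assumed 200 TFLOP/s peak compute
    peak_bw=1.5e12,               # assumed 1.5 TB/s memory bandwidth
)
print(est["bound"])  # batch-1 decode GEMMs are typically memory-bound
```

At an arithmetic intensity of 1 FLOP/byte, this kernel sits far below the assumed machine balance (~133 FLOPs/byte), which is why small-batch inference tends to be bandwidth-limited and why micro-architectural features that reduce memory traffic matter for this role.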

Skills

Required

  • C/C++
  • Python
  • performance modeling
  • simulation infrastructure

Nice to have

  • Master's or Ph.D. in Electrical Engineering, Computer Engineering, or related field
  • chip architecture
  • micro-architecture analysis
  • profiling
  • bottleneck analysis
  • experimental design
  • micro-architecture trade-off analysis
  • AI inference acceleration features
  • accelerator or GPU performance analysis
  • AI inference software stack
  • compilers
  • runtimes
  • model serving systems
  • architectural simulators
  • performance modeling codebases

What the JD emphasized

  • AI performance
  • AI inference

Other signals

  • AI performance modeling
  • micro-architecture exploration
  • inference workload validation
  • hardware/software co-design