Member of Technical Staff, Software Co-design, AI HPC Systems - MAI Superintelligence Team

Microsoft · Big Tech · London, United Kingdom · Software Engineering

This role focuses on the co-design and productionization of next-generation AI systems at datacenter scale, optimizing performance, efficiency, and cost across hardware and software. It involves analyzing workloads, driving architectural decisions, optimizing distributed systems for training and inference, and influencing AI hardware design. The role also includes performance modeling, prototyping, and mentoring.

What you'd actually do

  1. Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.
  2. Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.
  3. Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.
  4. Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.
  5. Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.
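As a small illustration of the "what-if" performance modeling mentioned above, the sketch below uses a simple roofline model to project whether a workload would be compute- or bandwidth-bound on hypothetical current and next-generation accelerators. All specs and the arithmetic-intensity figure are invented for illustration, not real hardware numbers.

```python
# Minimal "what-if" roofline sketch: given hypothetical accelerator specs and a
# workload's arithmetic intensity, estimate attainable throughput and whether
# the workload is compute- or memory-bandwidth-bound.
# All numbers below are illustrative assumptions, not real hardware specs.

def attainable_tflops(peak_tflops: float, mem_bw_tbs: float,
                      intensity_flop_per_byte: float) -> float:
    """Roofline model: attainable = min(peak compute, bandwidth * intensity)."""
    return min(peak_tflops, mem_bw_tbs * intensity_flop_per_byte)

# Hypothetical current vs. next-generation accelerator (made-up specs).
current_gen = {"peak_tflops": 300.0, "mem_bw_tbs": 2.0}  # 2 TB/s memory
next_gen    = {"peak_tflops": 900.0, "mem_bw_tbs": 4.0}  # 4 TB/s memory

# Assumed arithmetic intensity of the workload, in FLOP per byte moved.
intensity = 100.0

for name, hw in [("current", current_gen), ("next-gen", next_gen)]:
    t = attainable_tflops(hw["peak_tflops"], hw["mem_bw_tbs"], intensity)
    bound = "compute" if t >= hw["peak_tflops"] else "bandwidth"
    print(f"{name}: {t:.0f} attainable TFLOP/s ({bound}-bound)")
```

Models like this stay deliberately coarse: the point is to compare scenarios (e.g. does tripling peak compute help if bandwidth only doubles?) early enough to influence hardware and platform roadmaps.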

Skills

Required

  • Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field, or equivalent practical experience.
  • 10+ years of experience in systems software, hardware architecture, or AI infrastructure.
  • Experience with AI accelerator or GPU architectures.
  • Experience with distributed systems and large-scale AI training/inference.
  • Experience with high-performance computing (HPC) and collective communications.
  • Experience with ML systems, runtimes, or compilers.
  • Experience with performance modeling, benchmarking, and systems analysis.
  • Experience with hardware–software co-design for AI workloads.
  • Proficiency in systems-level programming.

Nice to have

  • Deep understanding of AI hardware and systems
  • Expertise in optimizing AI workloads
  • Familiarity with custom kernels and scheduling strategies
  • Experience with prototyping and productionizing AI ideas
  • Mentoring senior engineers and researchers
  • Setting technical direction

What the JD emphasized

  • 10+ years of experience (or equivalent depth) working across systems software, hardware architecture, or AI infrastructure, with demonstrated impact at scale.
  • Strong background in one or more of the following areas:
      • AI accelerator or GPU architectures
      • Distributed systems and large-scale AI training/inference
      • High-performance computing (HPC) and collective communications
      • ML systems, runtimes, or compilers
      • Performance modeling, benchmarking, and systems analysis
      • Hardware–software co-design for AI workloads

Other signals

  • datacenter scale AI systems
  • hardware-software co-design
  • large-scale training and inference
  • next-generation accelerators
  • productionize AI systems