Member of Technical Staff, Software Co-design, AI HPC Systems - MAI Superintelligence Team

Microsoft · Mountain View, CA (+3 locations) · Software Engineering

This role focuses on the co-design and productionization of next-generation AI systems at datacenter scale, optimizing end-to-end performance and efficiency. It operates at the intersection of models, systems software, networking, storage, and AI hardware, influencing accelerator design, system architectures, and large-scale AI platforms. The work involves analyzing real workloads, developing performance models, and partnering with teams across the company to drive high-impact ideas into production systems. The role also contributes to research and the broader community through publications and open-source releases.

What you'd actually do

  1. Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.
  2. Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.
  3. Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.
  4. Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.
  5. Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.
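As an illustration of the "what-if performance models" mentioned in item 4, here is a minimal roofline-style sketch in Python. All hardware numbers and the workload intensity are hypothetical, chosen only to show how such a model projects whether a workload is bandwidth- or compute-bound across accelerator generations:

```python
# Toy roofline-style "what-if" model: projects attainable throughput of a
# workload on a hypothetical accelerator. All numbers below are illustrative,
# not specs of any real part.

def attainable_tflops(peak_tflops: float, mem_bw_tbs: float,
                      arithmetic_intensity: float) -> float:
    """Roofline bound: min of compute peak and bandwidth * intensity.

    arithmetic_intensity is in FLOPs per byte; mem_bw_tbs is in TB/s,
    so mem_bw_tbs * arithmetic_intensity yields TFLOP/s.
    """
    return min(peak_tflops, mem_bw_tbs * arithmetic_intensity)

# What-if: the same kernel (300 FLOPs/byte) on two hypothetical generations.
current_gen = attainable_tflops(peak_tflops=1000, mem_bw_tbs=3.0,
                                arithmetic_intensity=300)  # bandwidth-bound
next_gen = attainable_tflops(peak_tflops=2000, mem_bw_tbs=8.0,
                             arithmetic_intensity=300)     # compute-bound
print(current_gen, next_gen)  # → 900.0 2000
```

Real co-design models add communication, memory capacity, and reliability terms, but the same pattern holds: parameterize the hardware, sweep future workloads, and feed the projections back into roadmap decisions.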

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

Nice to have

  • Master's Degree in Computer Science or related technical field AND 8+ years of technical engineering experience
  • Bachelor's Degree in Computer Science or related technical field AND 12+ years of technical engineering experience
  • AI accelerator or GPU architectures
  • Distributed systems and large-scale AI training/inference
  • High-performance computing (HPC) and collective communications
  • ML systems, runtimes, or compilers
  • Performance modeling, benchmarking, and systems analysis
  • Hardware–software co-design for AI workloads

What the JD emphasized

  • productionize next-generation AI systems
  • large-scale training
  • large-scale AI platforms
  • production systems
  • AI hardware
  • AI accelerator
  • distributed systems and large-scale AI training/inference
  • high-performance computing
  • ML systems
  • hardware–software co-design for AI workloads

Other signals

  • AI systems at datacenter scale
  • hardware–software co-design
  • large-scale training and inference
  • next-generation AI platforms
  • optimizing end-to-end performance, efficiency, reliability, and cost