Principal AI Software Architect

Microsoft · Big Tech · Redmond, WA +2 · Software Engineering

This Principal AI Software Architect role focuses on enabling and optimizing machine learning model training workflows on custom hardware (MAIA accelerators). It requires expertise in PyTorch and in Triton or CUDA, along with an understanding of accelerator architecture for efficient deployment of large models.

What you'd actually do

  1. Lead by example across teams and mentor others to produce extensible, maintainable, well-tested, secure, and performant code that adheres to design specifications and is used across products.
  2. Lead efforts to continuously improve code performance, testability, maintainability, effectiveness, and cost, while learning about and accounting for relevant trade-offs.
  3. Identify best practices and coding patterns (e.g., leveraging state-of-the-art generative artificial intelligence [GenAI], approaches to source code organization, naming conventions) and provide deep expertise in coding and validation strategy.
  4. Create and apply metrics to drive code quality and stability, appropriate coding patterns, and best practices.
  5. Identify and anticipate blockers or unknowns during development, escalate them, communicate how they will affect timelines, and lead efforts to identify and implement strategies to address them.

Skills

Required

  • C
  • C++
  • PyTorch
  • CUDA
  • Triton

Nice to have

  • C#
  • Java
  • JavaScript
  • Python
  • Accelerator architecture
  • Mapping of models to accelerators
  • Number formats
  • AI model architecture

What the JD emphasized

  • PyTorch
  • Triton
  • CUDA

Other signals

  • optimizing machine learning model training workflows on cutting-edge hardware
  • bringing up and validating training processes on MAIA accelerators
  • enabling training recipes developed for Microsoft’s first-party accelerators
  • PyTorch and at least one of Triton or CUDA