Member of Technical Staff, Full Stack - ML Efficiency & Observability - MAI Superintelligence Team

Microsoft · Big Tech · Mountain View, CA +2 · Software Engineering

Full Stack Engineer on the MAI Superintelligence Team focused on ML Efficiency & Observability, building capacity management portals and visibility into model performance for ML researchers and executives. The role involves designing and developing features for user interfaces, integrating with backend APIs for training frameworks, and contributing to internal tooling and infrastructure.

What you'd actually do

  1. Design and develop features for our capacity management portal
  2. Design and develop features to provide visibility into model performance and quality across our fleet
  3. Partner with ML researchers and PMs to translate functional requirements into highly functional, intuitive and appealing interfaces
  4. Integrate with backend APIs from schedulers to training frameworks to build visibility across the training life cycle
  5. Explore, develop, and adapt new innovations to the software development process

Skills

Required

  • Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • 4+ years of experience in business analytics, data science, software development, data modeling, or data engineering work

Nice to have

  • Bachelor’s Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 6+ years of experience in business analytics, data science, software development, data modeling, or data engineering work, OR Master’s Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4+ years of experience in business analytics, data science, software development, data modeling, or data engineering work, OR equivalent experience
  • Experience with Capacity Management, Efficiency Management, ML Training and/or Inference
  • Solid expertise in JavaScript / TypeScript, React, HTML, CSS and browser internals
  • Solid understanding of web performance, accessibility, and cross‑browser compatibility
  • Experience developing and debugging in environments such as Visual Studio or Visual Studio Code
  • Software development experience with Generative AI tools
  • Experience in leading technical projects and supporting architectural decisions with data.

What the JD emphasized

  • ML Efficiency & Observability
  • capacity management portal
  • visibility into model performance and quality
  • training and inference infrastructures
  • foundational models require large compute-capacity
  • efficiency improvements
  • ML Training and/or Inference

Other signals

  • ML Efficiency
  • Observability
  • Capacity Management
  • Training and Inference Infrastructures