Principal Software Engineer

Microsoft · Big Tech · Redmond, WA +1 · Software Engineering

Principal Software Engineer to build and operate mission-critical, hyperscale, high-performance, cost-efficient, and compliant AI infrastructure that powers Microsoft's Large Language Model (LLM) services across Microsoft 365 and other AI-powered products. The role involves leading a team, driving the design and delivery of the AI inferencing platform, and ensuring platform cost efficiency, availability, and operational excellence.

What you'd actually do

  1. Lead, mentor, and grow a high-performing team of engineers, fostering excellence, engagement, and continuous learning.
  2. Drive the design, development, and delivery of the AI Inferencing platform that powers AI experiences for millions of customers.
  3. Own platform cost efficiency, availability, and operational excellence, setting an industry-leading standard for reliability and performance.
  4. Coach engineers in building and operating large-scale distributed systems that serve hundreds of millions of users worldwide.
  5. Collaborate with product and engineering teams across Copilot, Power Platform, Business Applications, and the rest of Microsoft to deliver innovative AI solutions.

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

Nice to have

  • Master's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR
  • Bachelor's Degree in Computer Science or related technical field AND 15+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Strong software engineering skills with proficiency in Python and/or C#/.NET.
  • Experience leading technical projects and driving execution across complex engineering environments.
  • Experience building and operating hyperscale cloud services (4+ years).
  • Experience with Large Language Models (LLMs), AI orchestration frameworks, embedding models, and vector databases.
  • Experience with distributed systems design and implementation.
  • Experience with event-driven and message-based architectures.
  • Experience with high-scale OLTP or OLAP storage systems.
  • Experience leading complex initiatives that span multiple engineering teams or organizations.
  • Solid written, verbal, and executive communication skills.
  • Passion for AI technologies and delivering reliable, scalable, and cost-efficient customer experiences.

What the JD emphasized

  • AI Inferencing platform
  • hyperscale
  • cost-efficient
  • compliant
  • Large Language Models (LLMs)

Other signals

  • AI infrastructure
  • LLM services