Principal Software Engineer - CoreAI Model Inference & Serving

Microsoft · Big Tech · Redmond, WA +2 · Software Engineering

Principal Software Engineer role focused on building and scaling the AI data-plane for LLM inferencing across Microsoft and Azure. The role involves designing, coding, and shipping core serving systems, smart routing, and request distribution for a wide range of LLMs, aiming for reliability, efficiency, and ultra-low latency.

What you'd actually do

  1. Be a hands-on technical leader, designing, coding, and shipping core serving systems, smart routing, and request distribution for a broad portfolio of LLMs, including models from OpenAI, Mistral, Grok, DeepSeek, and others (see the sketch after this list).
  2. Build large-scale AI services and platform capabilities that power new products and customer experiences.
  3. Drive cutting-edge innovation in AI systems alongside world-class engineers and cross-functional partners.
  4. Lead through architecture, code reviews, mentorship, and technical excellence while staying close to implementation.
  5. Improve reliability, scalability, observability, efficiency, and performance across mission-critical services.
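
To make the "smart routing and request distribution" responsibility concrete, here is a minimal, purely illustrative sketch, not taken from the posting, of routing an inference request to the least-loaded backend registered for a requested model. All names (Router, Backend, Pick, the example URLs and model identifiers) are hypothetical.

```go
// Illustrative only: per-model, load-aware selection of an inference backend.
package main

import (
	"fmt"
	"sync"
)

// Backend is one serving endpoint for a particular model.
type Backend struct {
	URL      string
	Model    string // e.g. "gpt-4o", "mistral-large" (hypothetical identifiers)
	InFlight int    // current number of outstanding requests
}

// Router holds the registry of backends and picks one per request.
type Router struct {
	mu       sync.Mutex
	backends []*Backend
}

// Pick returns the least-loaded backend serving the requested model,
// or an error if no backend is registered for that model.
func (r *Router) Pick(model string) (*Backend, error) {
	r.mu.Lock()
	defer r.mu.Unlock()

	var best *Backend
	for _, b := range r.backends {
		if b.Model != model {
			continue
		}
		if best == nil || b.InFlight < best.InFlight {
			best = b
		}
	}
	if best == nil {
		return nil, fmt.Errorf("no backend serves model %q", model)
	}
	best.InFlight++ // count the request we are about to route
	return best, nil
}

func main() {
	r := &Router{backends: []*Backend{
		{URL: "https://pool-a.example.net", Model: "gpt-4o", InFlight: 3},
		{URL: "https://pool-b.example.net", Model: "gpt-4o", InFlight: 1},
		{URL: "https://pool-c.example.net", Model: "mistral-large"},
	}}
	if b, err := r.Pick("gpt-4o"); err == nil {
		fmt.Println("routing to", b.URL) // pool-b: lowest in-flight count
	}
}
```

A production AI data-plane would typically also weigh backend health, queue depth, token throughput, and tail latency when choosing a target, but the core idea of per-model, load-aware selection is the same.
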

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years of technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Java
  • Ability to meet Microsoft, customer and/or government security screening requirements

Nice to have

  • 4+ years of design and problem-solving experience, with an understanding of system performance, scalability, and engineering best practices.
  • Understanding of distributed systems, specifically request serving at scale (e.g., inferencing, L7 gateways, high-performance storage, and distributed databases across global-scale infrastructure).
  • Demonstrated experience in building high-quality, reliable systems at scale.
  • Experience using modern AI-assisted development tools and workflows to move faster, improve quality, and amplify engineering impact.
  • Customer-obsessed approach to problem solving, with empathy and a drive to deliver impactful solutions.

What the JD emphasized

  • core serving systems
  • request distribution
  • large-scale AI services
  • platform capabilities
  • reliability
  • scalability
  • observability
  • efficiency
  • performance

Other signals

  • serving LLMs at scale
  • low latency inference
  • AI data-plane