Principal Applied AI Engineering Manager at Microsoft

Skills

Required

Python
C#
LLM-based systems
prompt engineering
RAG architectures
agent frameworks
cloud platforms (Azure)
cloud-native service development
microservices
containers
CI/CD
people management
team leadership

Nice to have

low-code application development
engineering product/technical program management
data analysis
product development
Dataverse
Power Applications
managing and configuring artificial intelligence solutions
chatbots

Overview

Are you a customer-obsessed, AI-curious engineering leader who thrives in an inclusive, collaborative global team? The Azure Engineering Operations (EngOps) team's mission is to transform Microsoft Cloud customers into fans. Through our deep engineering engagements with customers and teams across Microsoft, we analyze and amplify customer needs and drive the vision to improve Cloud quality, security, and reliability. Our culture of growth mindset and empowerment is central to who we are and how we work.

Our ACES engineering team is building domain-specific agents and services that are fundamentally transforming Azure customer support, moving it from reactive, human-only workflows to AI-first systems that resolve customer issues at Azure scale. We are seeking a Principal Applied AI Engineering Manager to build and lead a high-performing AI engineering pod focused on shipping production agentic AI services. You will own the engineering execution for next-generation AI agents (LLM orchestration, multi-agent coordination, RAG-based systems, and evaluation frameworks) that handle thousands of customer cases per month and drive measurable case volume reduction across Azure.

This is a leadership role building production AI services that directly impact millions of Azure customers. You will build the team, set the technical direction, own the delivery, and be accountable for engineering outcomes.

Every day, our customers stake their business and reputation on our cloud. You can help Azure EngOps provide our customers with the world-class cloud services they need to succeed.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

• Build and lead the AI engineering pod — Hire, coach, and develop a team of Applied AI Engineers; foster an inclusive, high-trust culture where engineers ship production AI services with ownership and velocity.

• Own engineering execution for agentic AI services — Drive the end-to-end lifecycle of production AI agents from spec to deployment, including LLM orchestration, multi-agent workflows, RAG pipelines, and evaluation systems.

• Set technical direction and engineering standards — Define architecture patterns, code quality bar, evaluation frameworks, deployment practices, and observability standards for the AI engineering pod. Ensure production-quality C# and Python with TDD, CI/CD, staged rollouts, and full observability.

• Own production health and reliability — Ensure deployed AI agents meet quality, performance, and safety standards. Drive incident response, root cause analysis, and continuous improvement for agent systems in production.

• Build and maintain evaluation systems — Establish evaluation frameworks including rubrics, golden datasets, and judge agents to validate agent correctness and safety before and after production deployment. Ensure agents graduate through shadow mode to autonomous operation with eval gates at each stage.

• Influence product and platform roadmaps — Partner with product, platform engineering, and Azure service teams to shape the agentic AI platform direction. Translate customer support patterns and demand signals into engineering priorities.

• Drive measurable business impact — Own KPIs including case volume reduction, automation accuracy, resolution time improvement, and customer satisfaction impact. Use data-driven insights to set OKRs and demonstrate engineering ROI.

• Ensure responsible AI and compliance — Partner with AI Governance to embed responsible AI practices, PII protection, action boundaries, and audit trails into all agent systems architecturally.

• Develop engineering talent — Mentor engineers on applied AI engineering craft, including agentic system design, evaluation-driven development, and the judgment to know where agents should and should not act autonomously. Build career growth paths across AI engineering competencies.

Qualifications

Required qualifications

Bachelor's Degree AND 6+ years of experience in low-code application development, engineering product/technical program management, data analysis, or product development, OR equivalent experience.

Additional or preferred qualifications

Bachelor's Degree AND 12+ years of experience in low-code application development, engineering product/technical program management, data analysis, or product development, OR equivalent experience.
Proficiency in Python and C# with hands-on experience building production AI/ML systems.
Experience with LLM-based systems, including prompt engineering, RAG architectures, or agent frameworks.
Experience with cloud platforms (Azure) and cloud-native service development (microservices, containers, CI/CD).
3+ years of people management and/or informal/indirect team leadership experience.
4+ years of experience using low-code/no-code platforms (e.g., Dataverse, Power Applications).
3+ years of experience managing and configuring artificial intelligence solutions (e.g., chatbots).
4+ years of experience with programming/coding.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.