Principal Product Manager/Architect - Foundry Inference Platform (CoreAI)

Microsoft · Big Tech · Redmond, WA +1 · Product Management

The Principal Product Manager/Architect will define and guide the technical architecture of Microsoft Foundry, an AI inferencing platform focused on reliability, scalability, and efficiency for large-scale GPU fleets. The role involves setting product direction for reliability, GPU fleet efficiency, capacity management, and engaging with strategic customers. Success metrics include platform reliability, GPU utilization, and customer outcomes.

What you'd actually do

Own the product direction for Microsoft Foundry inference, with a primary mandate to make the platform the most reliable enterprise inferencing service available.
Set the product direction for GPU fleet efficiency and capacity management, guiding platform-level design decisions that maximize utilization, minimize fragmentation, and accelerate time to monetization of new hardware and models.
Act as a senior technical advisor and architect for Foundry’s most innovative and strategic customers, particularly those pushing the boundaries of scale, reliability, or model complexity.
Serve as a unifying architectural voice across product management, engineering, infrastructure, and partner teams.

Skills

Required

Bachelor's Degree AND 10+ years experience in product/service/program management or software development OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements

Nice to have

Proven technical leadership with deep experience designing and operating planet-scale distributed systems, preferably in cloud, AI, or high-performance compute platforms.
Proven track record owning end-to-end architecture for mission-critical services with strong availability, resilience, and operational guarantees.
Deep understanding of GPU-backed inference systems, capacity management, scheduling, and performance optimization at scale.

What the JD emphasized

end-to-end accountability for the product direction
deeply engaged in near-term execution
primary mandate to make the platform the most reliable enterprise inferencing service available
architectural standards for global serving, multi-region resiliency, automated failover, and platform-managed disaster recovery
evolve the system from customer-managed resilience to platform-managed global reliability
architectural alignment across global routing, capacity pooling, observability, and control plane abstractions
reliability targets, SLAs, SLOs and recovery objectives are designed into the platform by default
maximize utilization, minimize fragmentation, and accelerate time to monetization of new hardware and models
architecture for global capacity pooling, intelligent scheduling, fungibility across workloads, automated demand forecasting, and software-defined allocation
deep optimization learnings into durable platform primitives, enabling sustained efficiency gains rather than one-off wins
influence architectural investments across inference utilization, model serving, and hardware/system performance
senior technical advisor and architect for Foundry’s most innovative and strategic customers
pushing the boundaries of scale, reliability, or model complexity
deep technical challenges, including large-scale model migrations, reliability-sensitive production deployments, and advanced serving architectures
articulating Foundry’s architectural advantages, turning bespoke requests into scalable features
customer feedback meaningfully influences platform roadmap and architectural priorities
operate at CTO/Chief Architect level with customers
unifying architectural voice across product management, engineering, infrastructure, and partner teams
Drive alignment on long-term technical direction, resolve architectural tradeoffs
connect technical design choices to business outcomes, including cost efficiency, customer trust, and platform differentiation
Proven technical leadership with deep experience designing and operating planet-scale distributed systems, preferably in cloud, AI, or high-performance compute platforms.
Proven track record owning end-to-end architecture for mission-critical services with strong availability, resilience, and operational guarantees.

Other signals

AI inferencing platform
large-scale GPU fleet management
reliability, efficiency, and customer trust at global scale
planet-scale distributed systems


Overview

We are seeking a Principal Product Manager/Architect to define and guide the technical architecture of Microsoft Foundry as the most reliable, scalable, and efficient AI inferencing platform in the industry. This role sits at the intersection of platform architecture, large-scale GPU fleet management, and strategic customer engagement, with end-to-end accountability for the product direction that shapes reliability, efficiency, and customer trust at global scale.

This leader will partner with Engineering and Product Management leaders to drive reliability, efficiency, and strategic customer engagement while remaining deeply engaged in near-term execution. The role partners closely with engineering, product, and customer teams across CoreAI, Azure, and 1P products to ensure Foundry delivers industry-leading reliability, world-class GPU efficiency, and differentiated value for Microsoft’s most strategic AI customers.

Responsibilities

Product Reliability

Own the product direction for Microsoft Foundry inference, with a primary mandate to make the platform the most reliable enterprise inferencing service available. This includes defining architectural standards for global serving, multi-region resiliency, automated failover, and platform-managed disaster recovery, evolving the system from customer-managed resilience to platform-managed global reliability.

Drive architectural alignment across global routing, capacity pooling, observability, and control plane abstractions to ensure consistent availability, predictable recovery behavior, and simplified customer operations at scale. Partner with engineering, infrastructure, and security leaders to ensure reliability targets, SLAs, SLOs and recovery objectives are designed into the platform by default, not added as afterthoughts.

GPU Fleet Efficiency & Capacity

Set the product direction for GPU fleet efficiency and capacity management, guiding platform-level design decisions that maximize utilization, minimize fragmentation, and accelerate time to monetization of new hardware and models.

This includes shaping the architecture for global capacity pooling, intelligent scheduling, fungibility across workloads, automated demand forecasting, and software-defined allocation, ensuring Foundry can scale demand while operating within real-world supply constraints. The role will work closely with efficiency and infra teams to translate deep optimization learnings into durable platform primitives, enabling sustained efficiency gains rather than one-off wins.

The Product Manager/Architect is expected to influence architectural investments across inference utilization, model serving, and hardware/system performance, ensuring that efficiency improvements are systemic, measurable, and repeatable across generations of models and GPUs.

Strategic Customer & Innovation Engagement

Act as a senior technical advisor and architect for Foundry’s most innovative and strategic customers, particularly those pushing the boundaries of scale, reliability, or model complexity. Engage directly with customers on deep technical challenges, including large-scale model migrations, reliability-sensitive production deployments, and advanced serving architectures.

Support competitive and strategic initiatives by articulating Foundry’s architectural advantages, turning bespoke requests into scalable features, and ensuring customer feedback meaningfully influences platform roadmap and architectural priorities. This role will frequently operate at CTO/Chief Architect level with customers, translating complex platform internals into clear, credible architectural guidance.

Cross-Company Technical Leadership

Serve as a unifying architectural voice across product management, engineering, infrastructure, and partner teams. Drive alignment on long-term technical direction, resolve architectural tradeoffs, and provide clear guidance on when to optimize for reliability, efficiency, performance, or speed.

The Product Manager/Architect will regularly engage with senior Microsoft leadership across 1P teams, producing architectural briefs, decision frameworks, and recommendations that connect technical design choices to business outcomes, including cost efficiency, customer trust, and platform differentiation.

What are the success metrics for this role?

Platform Reliability & Trust:
- Reduction in customer-visible reliability incidents
- Adoption of platform-managed resilience primitives
- Reduce effective RTO/MTTR

Deliver platform scale without customer-perceived capacity constraints:
- Improve revenue per GPU and time to revenue
- Reduce fragmentation across SKU, region, and workload
- Improve self-service success rate for scale-up/scale-down

Strategic Customer Outcomes & Competitive Positioning:
- Improved win rate and retention in competitive deals
- Reduction in bespoke customer exceptions
- Enable self-service migrations and upgrades

Qualifications

Required Qualifications

Bachelor's Degree AND 10+ years experience in product/service/program management or software development OR equivalent experience

Other Requirements

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Preferred Qualifications

Proven technical leadership with deep experience designing and operating planet-scale distributed systems, preferably in cloud, AI, or high-performance compute platforms.
Proven track record owning end-to-end architecture for mission-critical services with strong availability, resilience, and operational guarantees.
Deep understanding of GPU-backed inference systems, capacity management, scheduling, and performance optimization at scale.
Demonstrated ability to engage credibly with strategic enterprise customers, solving complex architectural problems and influencing platform direction based on real-world needs.
Exceptional communication skills, with the ability to translate complex technical concepts into clear guidance for executives, partners, and customers.

Product Management IC6 - The typical base pay range for this role across the U.S. is USD $163,000 - $296,400 per year. A different range applies to specific work locations within the San Francisco Bay Area and New York City metropolitan area; the base pay range for this role in those locations is USD $220,800 - $331,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about **requesting accommodations.**