Overview

At MSR Cambridge, we are shaping the future of AI infrastructure by tackling ambitious, long-horizon systems challenges that will define the next generation of AI platforms. Our team explores the full stack from models and systems to software and hardware, while working closely with product teams across Microsoft to translate research breakthroughs into impact at scale.

The Future AI Infrastructure (FAI) team is seeking a Postdoctoral Researcher to pursue foundational research on agentic AI systems. The research emphasis will be on multiagent system designs for scalable agentic workloads, with ML and systems techniques for efficient memory, communication, and orchestration of heterogeneous agents. This role is a two-year fixed-term contract and will suit candidates excited by open-ended research questions at the intersection of machine learning, systems, and next-generation AI platforms. The FAI team's proven record of breakthroughs (see AOC and MOSAIC) provides a strong pathway for your research to inform and shape future AI system designs in partnership with the broader MSR teams and Microsoft product teams.

Responsibilities

Conduct original research on the design, architecture, and optimization of agentic AI systems, focusing on memory, communication, and orchestration.
Prototype new components for multiagent inference with system-level optimizations (e.g., shared latent memory/KV-cache, agent-level parallelism) using relevant framework tools and inference backends such as vLLM and SGLang.
Explore ML and systems co-design opportunities, such as aligning model capabilities with systems constraints, hardware characteristics, and orchestration strategies, using PyTorch and other relevant tools for LLM fine-tuning on GPU clusters.
Evaluate proposed ideas through real-system experiments, large-scale benchmark evaluation, and empirical studies on real workloads.
Work closely with a multidisciplinary team to address both fundamental and applied research challenges.
Communicate results clearly, sharing insights with the wider team and partner groups.
Contribute to an open, multidisciplinary research environment.

Qualifications

Required/Minimum Qualifications:

PhD (or near completion) in Computer Science, Machine Learning, Electrical Engineering, or a related field.
Strong background in ML-systems co-design, AI inference systems, or machine learning systems.
Demonstrated ability to conduct independent, high-impact research, evidenced by publications, open-source systems, or deployed artifacts.
Ability to work effectively in collaborative, cross-disciplinary research teams.

Preferred/Additional Qualifications:

Familiarity with modern agentic systems, orchestration patterns, or large-scale ML infrastructure.
Experience in model post-training, reinforcement learning / evolution strategies, or supervised fine-tuning.
Experience in building high-performance LLM inference systems using SGLang or vLLM.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.