Cambridge Residency Programme - Researcher in Agentic AI Systems & Infrastructure

Microsoft · Big Tech · Cambridge, MA, United Kingdom +1 · Research Sciences

Researcher in Agentic AI Systems & Infrastructure, focusing on multiagent system design, memory, communication, and orchestration using ML and systems techniques. The role involves prototyping components for multiagent inference with system-level optimizations, exploring ML & systems co-design, and evaluating ideas through experiments and benchmarks.

What you'd actually do

  1. Conduct original research on the design, architecture, and optimization of agentic AI systems, focusing on memory, communication, and orchestration.
  2. Prototype new components for multiagent inference with system-level optimizations (e.g. shared latent memory/KV-cache, agent-level parallelism) using relevant framework tools and inference backends like vLLM and SGLang.
  3. Explore ML & systems co-design opportunities, such as aligning model capabilities with systems constraints, hardware characteristics, and orchestration strategies, using PyTorch and other relevant tools for LLM fine-tuning on GPU clusters.
  4. Evaluate proposed ideas through real-system experiments, large-scale benchmark evaluation, and empirical studies on real workloads.
  5. Work closely with a multidisciplinary team to address both fundamental and applied research challenges.

Skills

Required

  • PhD (completed or near completion) in Computer Science, Machine Learning, Electrical Engineering, or a related field.
  • Strong background in ML-systems co-design, AI inference systems, or machine learning systems.
  • Demonstrated ability to conduct independent, high-impact research, evidenced by publications, open-source systems, or deployed artifacts.
  • Ability to work effectively in collaborative, cross-disciplinary research teams.

Nice to have

  • Familiarity with modern agentic systems, orchestration patterns, or large-scale ML infrastructure.
  • Experience in model post-training, reinforcement learning / evolution strategies, or supervised fine-tuning.
  • Experience in building high-performance LLM inference systems using SGLang or vLLM.

What the JD emphasized

  • independent, high-impact research
  • publications, open-source systems, or deployed artifacts

Other signals

  • agentic AI systems
  • multiagent system designs
  • orchestration of heterogeneous agents
  • ML and systems techniques for efficient memory, communication, and orchestration