Research Intern - AI Systems & Architecture

Microsoft · Mountain View, CA +2 · Applied Sciences

Research internship focused on AI systems and architecture, investigating performance modeling, architectural analysis, and emerging inference mechanisms for large-scale AI workloads. The role involves analyzing hardware, software, and model interactions, developing performance models, and prototyping new inference techniques.

What you'd actually do

  1. Investigate emerging AI system architectures and analyze how hardware, software, and model behavior interact across large-scale inference workloads.
  2. Develop and evaluate analytical or simulation-based performance models to identify system bottlenecks, scalability limits, and optimization opportunities.
  3. Prototype or assess new inference mechanisms, including disaggregated execution, sparse/expert model scaling, and hierarchical attention techniques.
  4. Explore next-generation accelerator, memory-architecture, and interconnect technologies, assessing their architectural trade-offs and cost implications.
  5. Conduct experiments, synthesize research findings, and communicate results to mentors and collaborating researchers.
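The analytical performance modeling in item 2 is often roofline-style analysis: given a workload's arithmetic intensity, determine whether it is compute- or bandwidth-bound on a target accelerator. A minimal sketch follows; all hardware numbers and workload figures are hypothetical placeholders, not tied to any actual Microsoft system.

```python
# Minimal roofline-style analytical model: classifies a kernel as
# compute-bound or memory-bound on hypothetical accelerator specs.

PEAK_FLOPS = 300e12   # hypothetical peak compute, FLOP/s
PEAK_BW = 1.5e12      # hypothetical HBM bandwidth, bytes/s

def attainable_flops(arithmetic_intensity):
    """Roofline: achievable throughput is capped by either peak compute
    or memory bandwidth times arithmetic intensity (FLOP per byte)."""
    return min(PEAK_FLOPS, PEAK_BW * arithmetic_intensity)

def analyze(name, flops, bytes_moved):
    ai = flops / bytes_moved                 # arithmetic intensity, FLOP/B
    perf = attainable_flops(ai)
    bound = "compute" if perf >= PEAK_FLOPS else "memory"
    print(f"{name}: AI={ai:.1f} FLOP/B, {bound}-bound, "
          f"est. {flops / perf * 1e3:.2f} ms")

# Illustrative contrast: decode-phase attention has low arithmetic
# intensity (memory-bound), while a large prefill GEMM has high
# intensity (compute-bound). Figures are made up for illustration.
analyze("decode attention", flops=2e9, bytes_moved=4e9)
analyze("prefill GEMM", flops=4e13, bytes_moved=8e10)
```

Even a model this simple exposes the bottleneck asymmetry that motivates techniques like disaggregated execution, where prefill and decode run on separately provisioned hardware.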

Skills

Required

  • Currently enrolled in a PhD program in Computer Science, Electrical/Computer Engineering, or a related field.

Nice to have

  • Research experience in areas such as computer architecture, AI/ML systems, performance modeling, distributed systems, or hardware–software co-design.
  • Programming skills in Python and/or C/C++, with experience building prototypes, simulators, or performance-analysis tools.
  • Familiarity with modern AI workloads and/or deep learning frameworks (e.g., PyTorch).
  • Demonstrated ability to define and pursue original research directions in AI systems or architecture.
  • Ability to collaborate effectively with researchers across disciplines and work in cross-group, cross-cultural environments.
  • Strong communication and presentation skills for conveying complex technical insights.
  • Ability to think creatively and approach system and architecture challenges with unconventional or innovative solutions.
  • Experience with PyTorch, CUDA, Triton, or performance-simulation tools.
  • Background in large-scale system design, AI inference bottleneck analysis, or modeling cost/performance tradeoffs.
  • Understanding of accelerator, memory-system, or interconnect design principles.

What the JD emphasized

  • Currently enrolled in a PhD program in Computer Science, Electrical/Computer Engineering, or a related field.

Other signals

  • AI systems and architecture
  • performance modeling
  • architectural analysis
  • inference mechanisms
  • disaggregated inference
  • sparse/expert model scaling
  • accelerator, memory-architecture, and interconnect technologies