Research Intern - Applied Sciences Group

Microsoft · Big Tech · Redmond, WA +1 · Applied Sciences

Research intern to investigate Small Language Model (SLM) architectures and techniques, such as recurrent transformers and universal transformers, for maximizing LLM throughput under limited high-speed cache on hardware targets such as SoCs, GPUs, and NPUs. The internship involves model training at scale on Azure compute and collaboration with a multidisciplinary team.

What you'd actually do

  1. Work with a small team to investigate recent Small Language Model (SLM) architectures and techniques, such as recurrent transformers and universal transformers, as potential approaches for maximizing the throughput of Large Language Models (LLMs) with limited high-speed cache (see the sketch after this list).
  2. Learn how to apply your model training skills at scale using Azure compute.
  3. Be mentored by a multidisciplinary team with expertise in both on-device implementation and literature/state-of-the-art (SotA) approaches.
  4. Collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community.
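
For context on item 1: the core idea behind universal/recurrent transformers is that one set of layer weights is reused at every depth step, so the weight footprint that must stay resident in high-speed cache shrinks roughly in proportion to the number of distinct layers removed. The PyTorch sketch below is illustrative only, not from the posting; the class name, dimensions, and step count are assumptions.

    # Illustrative sketch: a "universal" encoder that reuses one layer's
    # weights across depth, vs. a conventional stack of distinct layers.
    import torch
    import torch.nn as nn

    class UniversalEncoder(nn.Module):  # hypothetical name, not from the JD
        def __init__(self, d_model=256, nhead=4, num_steps=6):
            super().__init__()
            # A single transformer layer, applied num_steps times (weights shared).
            self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.num_steps = num_steps

        def forward(self, x):
            for _ in range(self.num_steps):
                x = self.layer(x)  # same weights at every depth step
            return x

    def n_params(m):
        return sum(p.numel() for p in m.parameters())

    shared = UniversalEncoder()
    stacked = nn.TransformerEncoder(  # baseline: six distinct layers
        nn.TransformerEncoderLayer(256, 4, batch_first=True), num_layers=6)
    # The shared model holds roughly 1/6 of the stacked model's weights,
    # i.e. far less to keep resident in cache at the same nominal depth.
    print(f"shared: {n_params(shared):,} params vs stacked: {n_params(stacked):,}")
    y = shared(torch.randn(1, 8, 256))  # (batch, seq, d_model)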

Skills

Required

  • Advanced Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field
  • At least 1 year of experience in investigating and modifying transformer-based AI models

Nice to have

  • Project portfolio, open-source code, or other verifiable evidence of pursuing state-of-the-art AI systems
  • Experience with different hardware platforms (SoC, GPU, NPU)

What the JD emphasized

  • transformer-based AI models

Other signals

  • investigate recent Small Language Model (SLM) architectures and techniques
  • maximizing the throughput of Large Language Models (LLMs) with limited high-speed cache
  • apply your model training skills at scale using Azure compute
  • mentored by a multidisciplinary team with expertise in both on-device implementation and literature/state-of-the-art (SotA) approaches