Research Intern - AI/ML Numerics & Efficiency

Microsoft · Big Tech · Redmond, WA +1 · Applied Sciences

Research Intern role focusing on ML systems, numeric precision, data types, and compute technologies for AI workloads at Azure scale. The role involves investigating model efficiency through low-precision formats, quantization, ML kernel development, and benchmarking. It aims to inform decisions on compute platforms, acceleration strategies, and system-level optimizations for training and inference of large-scale models.

What you'd actually do

  1. Contribute to research and exploration in advanced machine learning (ML) systems, focusing on the numerics, data types, and compute technologies that drive the next generation of Artificial Intelligence (AI) workloads at Azure scale.
  2. Collaborate across Azure teams to investigate cutting-edge approaches to model efficiency, ranging from low-precision formats, quantization strategies, and ML kernel development to benchmarking and analyzing emerging model architectures and hardware capabilities.
  3. Play a critical role in evaluating, prototyping, and analyzing new algorithmic and numerical techniques that improve the performance, cost, and efficiency of training and inference for large-scale models.
  4. Develop expertise in ML systems, emerging data types, kernel optimization, and performance modeling while gaining hands-on experience with the latest Azure AI and hardware technologies.
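
To give a flavor of the low-precision and quantization work items 2 and 3 describe, here is a minimal sketch of symmetric per-tensor int8 quantization. The function names and scheme are illustrative assumptions, not the team's actual methods:

```python
import numpy as np

def quantize_int8(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization: map floats into [-127, 127]."""
    scale = float(np.max(np.abs(x))) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover a float approximation of the original tensor."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Round-to-nearest bounds the per-element error by half a quantization step.
print(float(np.max(np.abs(w - w_hat))))
```

Evaluating formats like this — measuring the error a given bit width introduces and how it propagates through training or inference — is the kind of benchmarking and analysis the role calls out.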

Skills

Required

  • Python
  • C++
  • machine learning systems

Nice to have

  • transformer-based model architectures
  • attention mechanisms
  • KV cache behavior
  • PyTorch
  • Hugging Face Transformers
  • SGLang
  • vLLM
  • TensorRT-LLM
  • GPU programming
  • CUDA
  • Triton
  • profiling
  • performance analysis
  • low-precision numerics
  • quantization methods
  • hardware–software co-design
  • ML systems
  • model optimization
  • kernel development
  • numerical computing
  • analytical skills
  • problem-solving skills
  • computational performance

What the JD emphasized

  • model efficiency
  • training and inference
  • large-scale models
  • ML systems
  • kernel development
  • performance modeling
  • low-precision formats
  • quantization strategies

Other signals

  • research
  • ML systems
  • model efficiency
  • Azure scale
  • training and inference