Member of Technical Staff, AI Networkin… at Microsoft

What you'd actually do

Advanced RoCE transport design, congestion control, and ECN/WRED/DCTCP tuning

Fabric architecture, topology planning, network modeling, and scaling strategy

Telemetry, observability, reliability engineering, and automated troubleshooting

Develop and tune the deployment of novel routing techniques to achieve reliability in large networks

Work with world-class network designers like NVIDIA, Broadcom, and in-house silicon/network co-design teams

Skills

Required

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

Nice to have

Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

What the JD emphasized

scale the world’s most advanced high-performance networks

enables multi-gigawatt AI supercomputers

supports the training of the most sophisticated AI models on the planet

design, bring up, and scale the distributed Ethernet and InfiniBand fabrics that connect hundreds of thousands of GPUs

AI training + inference cluster bring-up, performance benchmarking, and root-cause analysis

develop the pretraining compute roadmap

Other signals

building the fabric that connects frontier-class datacenters

gather data and insights to develop the pretraining compute roadmap

Overview

Microsoft AI is hiring a Member of Technical Staff, AI Networking to design and scale the world’s most advanced high-performance networks powering Copilot and next-generation AI systems. Join the team building the fabric that connects frontier-class datacenters, enables multi-gigawatt AI supercomputers, and supports the training of the most sophisticated AI models on the planet.

Building these models to develop novel, responsible, and efficient artificial general intelligence requires large compute capacity. As an AI Networking Engineer, you’ll shape the end-to-end networking architecture, from the link layer to fabric-wide systems, for hyperscale AI training clusters. You’ll design, bring up, and scale the distributed Ethernet and InfiniBand fabrics that connect hundreds of thousands of GPUs across multi-megawatt data halls. You’ll benchmark, profile, debug, and tune the training and inference of AI workloads running in the production clusters. You’ll engineer ultra-low-latency RoCE networks, design congestion-free transport mechanisms, optimize lossless fabrics at 10k–100k+ GPU scale, and partner deeply across Azure, Microsoft AI, and datacenter teams to turn cutting-edge ideas into running global infrastructure. If you want to build networking systems that push physics, silicon, and software to the limit and directly accelerate Microsoft’s frontier AI models, this is the most exciting seat in the industry.

Microsoft Superintelligence Team

The Microsoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

This role is part of Microsoft AI's Superintelligence Team (MAIST). The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence: ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society by advancing science, education, and global well-being.

We’re also fortunate to partner with incredible product teams, giving our models the chance to reach billions of users and create immense positive impact. If you’re a brilliant, highly ambitious, low-ego individual, you’ll fit right in. Come and join us as we work on our next generation of models!

Starting January 26, 2026, MAI employees are expected to work from a designated Microsoft office at least four days a week if they live within 50 miles (U.S.) or 25 miles (non-U.S., country-specific) of that location. This expectation is subject to local law and may vary by jurisdiction.

Responsibilities

Advanced RoCE transport design, congestion control, and ECN/WRED/DCTCP tuning
Fabric architecture, topology planning, network modeling, and scaling strategy
Telemetry, observability, reliability engineering, and automated troubleshooting
Develop and tune the deployment of novel routing techniques to achieve reliability in large networks
Work with world-class network designers like NVIDIA, Broadcom, and in-house silicon/network co-design teams
AI training + inference cluster bring-up, performance benchmarking, and root-cause analysis
Gather data and insights to develop the pretraining compute roadmap
Find a path past roadblocks so your work reaches users quickly and iteratively
Enjoy working in a fast-paced, design-driven, product development cycle
Embody our Culture and Values

Qualifications

Required Minimum Qualifications

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience.

Additional Preferred Qualifications

Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.