What you'd actually do

Spearhead architecture definition and evaluation of AI accelerator platforms, with a focus on high-bandwidth, low-latency networks.

Drive end-to-end optimization of the stack, from hardware to the software kernels.

Partner with silicon and platform design teams to co-design infrastructure that meets performance, reliability and deployment goals.

Frame decisions in terms of TCO, performance, flexibility, and scalability.

You will work with a state-of-the-art networking lab to prototype new network architectures.

Skills

Required

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Ability to meet Microsoft, customer and/or government security screening requirements

Nice to have

Master’s or Doctoral degree in Electrical Engineering, Computer Engineering, or related fields and 10+ years of technical experience in the domain.
Deep expertise with Ethernet networking, RDMA (RoCE, InfiniBand), congestion control, and layer 2/3 switching.
Experience architecting scale-out/backend networks for AI GPU clusters
Familiarity with scale-up networks such as NVLink and UALink.
Experience with high-radix Ethernet switches
Familiarity with AI model execution pipelines, including the ability to analyze communication flows and their impact on model performance.
Prior contributions to standards committees and experience with hyperscale network deployments would be an added benefit
Skilled in partnering and influencing architects, hardware engineers, and software leads
Ability to manage through ambiguity, bringing clarity and results orientation to engage and energize collaborators and stakeholders
Collaboration skills, teamwork, and sense of presumed responsibility
Verbal and written communication skills, and ability to articulate and engage with both technical and non-technical stakeholders at all levels.
Experience leading and driving complex projects with respect and integrity, including those with multiple workstreams spanning different business and technical disciplines.
Intellectual curiosity and passion for learning and deploying new technologies.
Problem-solving skills, analytical capabilities, and attention to detail

Overview

Do you want to be at the forefront of innovating the latest hardware designs to propel Microsoft’s cloud growth? Are you seeking a unique career opportunity that combines technical capabilities and cross-team collaboration with business insight and strategy?

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to achieve our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Join the Systems Planning and Architecture (SPARC) team within Microsoft’s Azure Hardware Systems and Infrastructure (AHSI) organization, the team behind Microsoft’s expanding cloud infrastructure and responsible for powering Microsoft’s “Intelligent Cloud” mission. Microsoft delivers more than 200 online services to more than one billion individuals worldwide. We deliver the core infrastructure and foundational technologies for Microsoft's cloud businesses, including Microsoft Azure, Bing, MSN, Office 365, OneDrive, Skype, Teams and Xbox Live.

We are seeking a passionate Principal AI Network Architect to join the AI systems architecture team. The role includes network architecture evaluation, design and optimization for next-gen AI systems. Your work will have a direct influence on Azure product roadmaps.

Responsibilities

Leadership: Spearhead architecture definition and evaluation of AI accelerator platforms, with a focus on high-bandwidth, low-latency networks. Drive end-to-end optimization of the stack, from hardware to the software kernels.
Cross-functional collaboration: Partner with silicon and platform design teams to co-design infrastructure that meets performance, reliability and deployment goals. Frame decisions in terms of TCO, performance, flexibility, and scalability.
Prototyping: You will work with a state-of-the-art networking lab to prototype new network architectures.
Industry influence: Participate in industry consortiums to shape standards, and influence vendor roadmaps.

Qualifications

Required Qualifications

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.

Other Requirements

The ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications

Master’s or Doctoral degree in Electrical Engineering, Computer Engineering, or related fields and 10+ years of technical experience in the domain.
Deep expertise with Ethernet networking, RDMA (RoCE, InfiniBand), congestion control, and layer 2/3 switching.
Experience architecting scale-out/backend networks for AI GPU clusters
Familiarity with scale-up networks such as NVLink and UALink.
Experience with high-radix Ethernet switches
Familiarity with AI model execution pipelines, including the ability to analyze communication flows and their impact on model performance.
Prior contributions to standards committees and experience with hyperscale network deployments would be an added benefit
Skilled in partnering and influencing architects, hardware engineers, and software leads
Ability to manage through ambiguity, bringing clarity and results orientation to engage and energize collaborators and stakeholders
Collaboration skills, teamwork, and sense of presumed responsibility
Verbal and written communication skills, and ability to articulate and engage with both technical and non-technical stakeholders at all levels.
Experience leading and driving complex projects with respect and integrity, including those with multiple workstreams spanning different business and technical disciplines.
Intellectual curiosity and passion for learning and deploying new technologies.
Problem-solving skills, analytical capabilities, and attention to detail

Software Engineering IC5 - The typical base pay range for this role across the U.S. is USD $139,900 - $274,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $188,000 - $304,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.