Software Engineer II

Microsoft · Big Tech · United States · Software Engineering

Software Engineer II role focused on designing and building next-generation networking infrastructure for large-scale AI training and inference in Azure Cloud. The role involves developing high-performance, low-latency, and reliable networking capabilities to support distributed AI workloads, working at the intersection of AI and high-performance computing.

What you'd actually do

  1. Design, develop, and optimize networking solutions tailored for large-scale AI training infrastructure. Architect and implement high-performance, low-latency, and low-jitter communication frameworks for distributed systems.
  2. Benchmark, analyze, and enhance the scalability and reliability of networking systems to handle petabyte-scale data transfer.
  3. Debug and resolve complex networking issues in large-scale, high-performance environments.
  4. Drive identification of dependencies and the development of design documents for a product, application, service, or platform.
  5. Create, implement, optimize, debug, refactor, and reuse code to establish and improve performance, maintainability, effectiveness, and return on investment (ROI).

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 2+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Ability to meet Microsoft, customer and/or government security screening requirements

Nice to have

  • 1+ years of experience with any of the following: High Performance Networking, InfiniBand, RoCE, CUDA

What the JD emphasized

  • high-performance
  • low-latency
  • scalability
  • reliability
  • observability
  • large-scale AI training
  • networking infrastructure for AI training and inference

Other signals

  • AI training infrastructure
  • distributed AI supercomputer
  • large-scale AI training
  • networking infrastructure for AI training and inference