Software Engineer II

Microsoft · Big Tech · United States · Software Engineering

Software Engineer II role focused on designing, developing, and optimizing networking infrastructure for large-scale AI training and inference in Azure Cloud. The role involves ensuring high performance, low latency, and minimal jitter for distributed AI workloads, working with cutting-edge networking hardware and software.

What you'd actually do

  1. Design, develop, and optimize networking solutions tailored for large-scale AI training infrastructure. Architect and implement high-performance, low-latency, and low-jitter communication frameworks for distributed systems.
  2. Benchmark, analyze, and enhance the scalability and reliability of networking systems to handle petabyte-scale data transfer.
  3. Debug and resolve complex networking issues in large-scale, high-performance environments.
  4. Drive identification of dependencies and the development of design documents for a product, application, service, or platform.
  5. Create, implement, optimize, debug, refactor, and reuse code to establish and improve performance, maintainability, effectiveness, and return on investment (ROI).

Skills

Required

  • C
  • C++
  • Rust
  • Python
  • Distributed Systems
  • Software Design and Development

Nice to have

  • C#
  • Java
  • JavaScript
  • High Performance Computing
  • Machine Learning middleware
  • Communication Runtime
  • Hardware-Software co-design
  • Profiling and Performance Analysis Tools
  • High-Performance Networking Hardware/Architecture

What the JD emphasized

  • high-performance
  • low-latency
  • low-jitter
  • large-scale
  • scalability
  • reliability
  • observability
  • performance

Other signals

  • AI training at scale
  • distributed AI supercomputer
  • high-performance AI model training
  • next-generation networking capabilities
  • large-scale AI training and inference