Senior Software Engineer

Microsoft · Big Tech · Suzhou, Jiangsu, China +1 · Software Engineering

Senior Software Engineer role on the Microsoft 365 (M365) Workload Management (WLM) team, which manages the execution of millions of background and system tasks on M365 backend servers. The role focuses on maximizing resource utilization, ensuring system stability and efficiency, and driving intelligent automated operations. It requires strong software engineering fundamentals, experience with large-scale distributed systems, and the ability to own features from design to production.

What you'd actually do

  1. Help build integrated solutions to protect the M365 system from disruptive outages and crises.
  2. Develop and implement best practices for resource utilization and backend server management.
  3. Work hands-on with the team and its clients through design and implementation, and maintain communication with key partners across the Microsoft ecosystem of engineers.
  4. Take responsibility for technical problem solving, including creatively meeting product objectives and developing best practices.
  5. Continuously learn about evolving hardware and workload scenarios to inform optimization strategies.

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Proficiency in one or more programming languages such as C#, C++, Java, or similar, with solid software design and engineering fundamentals.
  • Experience developing and operating large-scale distributed systems, including debugging, performance tuning, and reliability improvements.
  • Demonstrated ability to own components or end-to-end features, from design through production and live-site support.

Nice to have

  • Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Experience building or evolving platform-level systems (e.g., scheduling, admission control, load management, storage, or backend infrastructure).
  • Solid understanding of distributed systems concepts, including concurrency control, resource management, fault tolerance, and scalability patterns.
  • Experience with data-driven systems, telemetry, monitoring, and signal-based decision making (e.g., health signals, throttling, load balancing).
  • Hands-on experience with cloud-native architectures and services (Azure, AWS, or similar), including microservices or service-oriented architecture, event-driven systems, and containerized workloads or hybrid cloud environments.
  • Experience building or supporting systems in AI-powered or high-throughput environments, where workload patterns can shift rapidly (e.g., Copilot-like scenarios).
  • Familiarity with AI/ML workload characteristics (e.g., bursty traffic, high compute/storage demand, latency sensitivity) is a plus.
  • Experience leveraging AI-assisted development tools (e.g., Copilot, code generation, automated diagnostics) to improve engineering productivity.
  • Proven ability to drive technical discussions and influence design decisions across teams.
  • Solid communication skills, with the ability to clearly articulate complex technical concepts.
  • Experience working in cross-team and global environments, partnering effectively with engineering, PM, and infrastructure teams.

What the JD emphasized

  • large-scale distributed systems
  • live-site support