Software Engineer II - CTJ - Poly

Microsoft · Big Tech · Redmond, WA +3 · Software Engineering

Software Engineer II role focused on building tools and systems for M365 sovereign cloud operations, including internal platforms, automation, and agentic workflows. The role involves writing code to solve operational challenges, ensuring reliability, and potentially leveraging GenAI. It also includes on-call responsibilities for live site operations, monitoring, and incident response, with a focus on security and compliance within sovereign cloud environments.

What you'd actually do

  1. Creates and implements code for a product, service, or feature, reusing code as applicable with minimal supervision. Writes and learns to create code that is extensible and maintainable. Considers diagnosability, reliability, and maintainability with few defects, and understands when the code is ready to be shared and delivered. Applies coding patterns and best practices to write code (e.g., leveraging state-of-the-art generative artificial intelligence [GenAI], approaches to source code organization, naming conventions).
  2. Acts as a designated responsible individual (DRI), working on-call to monitor a system, product feature, or service for degradation, downtime, or interruptions. Alerts stakeholders to the status and gains approval to restore the system, product, or service for simple problems. Responds within the service level agreement (SLA) timeframe. Escalates issues to appropriate owners.
  3. Maintains operations of live site service on a rotational, on-call basis, following security best practices and using the minimum required permissions when responding quickly to mitigate issues that arise. Identifies solutions and mitigations to simple issues, and to complex issues when applicable, that impact performance or functionality of live site services, and escalates appropriately. With minimal supervision, improves troubleshooting guides (TSGs), wikis, tests, and telemetry to make on-call better, and recommends user-facing support documentation and additional test coverage to reduce the likelihood of future user-initiated incidents.
  4. Contributes to identifying dependencies and incorporates them into the development of design documents for a product area with little oversight. Helps identify other teams and technologies to leverage, how they interact, and where their own system or team can support others. Understands downstream interactions between systems.
  5. Contributes to identifying requirements for, and developing, automation within production and deployment of a complex product feature, targeting zero-touch deployment when possible. Runs code in simulated or other non-production environments to confirm functionality and error-free runtime for products with little to no oversight.

Skills

Required

  • distributed systems
  • scalable services
  • coding
  • software development
  • on-call
  • live site operations
  • incident response
  • troubleshooting
  • automation
  • design documents
  • customer requirements

Nice to have

  • M365
  • sovereign cloud
  • security best practices
  • compliance

What the JD emphasized

  • agentic workflows
  • generative artificial intelligence [GenAI]
