Principal Software Engineer

Microsoft · Big Tech · IN · Software Engineering

Principal Software Engineer on the Azure Storage team, responsible for building and managing persistent cloud storage for Microsoft Azure. The role involves designing and implementing core platform capabilities and critical infrastructure for hyperscale storage systems, tackling challenges in fault tolerance, consistency, and cost-efficiency at zettabyte scale. Focus is on distributed systems, scalability, and reliability.

What you'd actually do

  1. Leads by example across teams and mentors others to produce extensible, maintainable, well-tested, secure, and performant code used across products that adheres to design specifications. Leads efforts to continuously improve code performance, testability, maintainability, effectiveness, and cost, while learning about and accounting for relevant trade-offs. Identifies best practices and coding patterns (e.g., leveraging state-of-the-art generative artificial intelligence [GenAI], approaches to source code organization, naming conventions) and provides deep expertise in the coding and validation strategy. Creates and applies metrics to drive code quality and stability, appropriate coding patterns, and best practices. Identifies and anticipates blockers or unknowns during the development process, escalates them, communicates how they will impact timelines, and then leads efforts to identify and implement strategies and/or opportunities to address them.
  2. Owns and leads efforts and discussions for the architecture of aspects of complex products/solutions (e.g., design, cost). Leads the testing and exploration of design options across a set of complex product/solution scenarios, outlining the strengths and weaknesses of each option and recommending the best one. Creates proposals for architecture and design documents, and leads testing of hypotheses and proposed complex solutions. Leads design discussions with the team, shares and acts on findings from investigations, owns design decisions, and oversees less experienced team members. Leads the development of design documents that support user stories and other product requirements. Evaluates new technologies to solve classes of problems, and determines how to integrate these technologies within existing systems. Leads efforts to ensure system architecture and individual designs meet performance, scalability, resiliency, disaster recovery, cost of goods sold (COGS), and other requirements and expectations. Upholds Microsoft standards of security, privacy, and other compliance requirements and expectations. Understands and coaches less experienced engineers on the importance of building solutions that expand upon the work of others. Leads the refinement of products through data analytics, and makes informed decisions in engineering products through data integration. Reviews complex designs/architectures within and across teams to provide recommendations for improvements.
  3. Identifies and applies best practices for building code based on well-established methods and secure design principles, including new code development and formal validation of security invariants, and shares them with other engineers. Leads product development and scaling to customer requirements, applying best practices for meeting scaling needs, performance expectations, and security promises.
  4. Leads experiments that determine the impact of changes using feature flags/flighting in their code, interprets results, and decides on next steps or a ship decision based on those results. Drives identification of the correct metrics for experiments that measure improvements in customer value. Drives collaboration efforts with internal partners (e.g., Data Science, product managers) to ensure incorporation of succes

Skills

Required

  • distributed systems
  • cloud storage
  • software design
  • architecture
  • scalability
  • fault tolerance
  • consistency
  • performance
  • security
  • privacy
  • compliance

Nice to have

  • GenAI