Principal Software Engineer

Microsoft · Big Tech · Redmond, WA +1 · Software Engineering

Principal Software Engineer role focused on defining and leading the technical direction for next-generation intelligent, large-scale content platforms. The role involves shaping multi-year technical strategy, driving architectural coherence, and building foundational platforms for multiple product surfaces. Responsibilities include designing distributed backend systems, cloud-native infrastructure, and data platforms at global scale, with a focus on availability, latency, security, and cost efficiency. A key aspect is guiding the integration of LLM-powered capabilities into production systems: defining patterns for retrieval, orchestration, evaluation, and responsible AI so that these capabilities are deeply embedded, reliable, and scalable. The role requires exceptional technical judgment, comfort operating in ambiguous spaces, and the ability to influence across teams and organizations while working closely with engineering leaders, product leaders, and applied scientists.

What you'd actually do

  1. Define and drive long-term technical strategy for large-scale distributed systems and platforms spanning multiple teams and organizations.
  2. Establish architectural principles, patterns, and standards that ensure consistency, scalability, and maintainability across services.
  3. Lead the design of system-of-systems architectures, integrating services, data, and AI capabilities into cohesive platforms.
  4. Make high-impact technical decisions that balance innovation, risk, cost, and long-term sustainability.
  5. Architect and evolve backend services, APIs, data infrastructure, and platform capabilities that operate at global scale.

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience

Nice to have

  • Master's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Bachelor's Degree in Computer Science or related technical field AND 15+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Proven track record of architecting and delivering large-scale distributed systems or platforms used by millions of users.
  • Deep expertise in backend systems, cloud-native architecture, and service-oriented or microservices-based design.
  • Experience owning and evolving production systems with high availability, low latency, and solid operational rigor.
  • Solid fundamentals in system design, distributed systems, data modeling, and performance optimization.

What the JD emphasized

  • Define and drive long-term technical strategy for large-scale distributed systems and platforms spanning multiple teams and organizations.
  • Lead the design of system-of-systems architectures, integrating services, data, and AI capabilities into cohesive platforms.
  • Define how LLMs and intelligent systems are integrated into core platform architecture (not as isolated features).
  • Establish patterns for retrieval, grounding, orchestration, memory, and tool use in production systems.
  • Lead the design of evaluation frameworks for quality, safety, latency, reliability, and business impact.
