Senior Software Engineer - CoreAI Model Inference & Serving

Microsoft · Big Tech · Redmond, WA +2 · Software Engineering

Senior Software Engineer role focused on building and scaling the AI data-plane for LLM inferencing across Microsoft and Azure. The role involves designing, coding, and shipping core serving systems, smart routing, and request distribution for a wide range of LLMs, aiming for reliability, efficiency, and ultra-low latency.

What you'd actually do

  1. Be a hands-on technical leader, designing, coding, and shipping core serving systems, smart routing, and request distribution for a broad portfolio of LLMs, including OpenAI, Mistral, Grok, DeepSeek, and others.
  2. Build large-scale AI services and platform capabilities that power new products and customer experiences.
  3. Drive cutting-edge innovation in AI systems alongside world-class engineers and cross-functional partners.
  4. Lead through architecture, code reviews, mentorship, and technical excellence while staying close to implementation.
  5. Improve reliability, scalability, observability, efficiency, and performance across mission-critical services.

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field
  • 4+ years of technical engineering experience
  • Coding experience in languages including, but not limited to, C, C++, C#, or Java

Nice to have

  • 4+ years of design and problem-solving experience
  • Understanding of system performance, scalability, and engineering best practices
  • Understanding of distributed systems, specifically request serving at scale
  • Experience using modern AI-assisted development tools and workflows
  • Customer-obsessed approach to problem solving

What the JD emphasized

  • core serving systems
  • request distribution
  • large-scale AI services
  • platform capabilities
  • reliability
  • scalability
  • observability
  • efficiency
  • performance

Other signals

  • serving LLMs at scale
  • ultra-low latency
  • AI data-plane