Software Engineer II - AI Engineering / Python/DS

Microsoft · Big Tech · Vancouver, BC +1 · Software Engineering

A Software Engineer II role building LLM-powered data engineering experiences and infrastructure for Microsoft Fabric, using Apache Spark. The work covers implementing agentic workflows, scalable LLM features, and robust data pipelines, with a strong emphasis on AI engineering and on evaluating and operationalizing LLM systems.

What you'd actually do

  1. Design, build, and ship scalable backend services and/or libraries in Python that power Fabric Data Engineering and Data Science experiences
  2. Develop LLM-enabled capabilities (prompting patterns, tool/function calling, RAG/grounding, orchestration/agents) with strong attention to latency, reliability, and cost
  3. Build robust data pipelines and distributed compute solutions (Spark/PySpark) to support model/data workflows, feature generation, and large-scale analytics
  4. Define evaluation strategies for LLM features (offline/online metrics, quality gates, safety checks), and implement telemetry/monitoring to continuously improve quality
  5. Apply Responsible AI and security/privacy best practices (data handling, governance, access controls) when integrating AI into customer-facing products

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience building production-grade Python services, libraries, or distributed systems; strong software engineering fundamentals (design, testing, performance, maintainability)
  • Hands-on experience with Apache Spark/PySpark and data engineering patterns for large-scale structured and unstructured data
  • Solid understanding of modern LLM systems and AI Engineering: prompting, grounding/RAG, tool/function calling, agent orchestration, and evaluation methodologies
  • Experience operationalizing AI/ML features: monitoring, telemetry, experimentation (A/B), rollout strategies, and cost/latency optimization
  • Familiarity with cloud-native engineering on Azure (compute, storage, networking) and secure, compliant data handling
  • Experience collaborating across disciplines (PM, design, research, partner teams) to deliver customer-facing AI capabilities

Nice to have

  • Master's Degree in Computer Science or related technical field AND 1+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

What the JD emphasized

  • production operations
  • modern LLM-based systems
  • production-grade Python services
  • distributed systems
  • strong software engineering fundamentals
  • Apache Spark/PySpark
  • data engineering patterns
  • modern LLM systems
  • AI Engineering
  • prompting
  • grounding/RAG
  • tool/function calling
  • agent orchestration
  • evaluation methodologies
  • operationalizing AI/ML features
  • monitoring
  • telemetry
  • experimentation (A/B)
  • rollout strategies
  • cost/latency optimization
  • cloud-native engineering on Azure
  • secure, compliant data handling
  • customer-facing AI capabilities

Other signals

  • LLM-powered data engineering experiences
  • agentic workflows
  • scalable LLM-backed data features
  • evaluation strategies for LLM features
  • operationalizing AI/ML features