Member of Technical Staff, Senior Applied AI Engineer

Microsoft · Big Tech · Redmond, WA +1 · Data Science

Senior Applied AI Engineer role focused on building and shipping LLM-powered assistant features and agentic systems. Responsibilities include designing and developing conversational flows, retrieval pipelines, and multimodal interactions using prompt architectures and orchestration logic. The role also involves building evaluation frameworks, running hillclimbing loops for continuous improvement, and developing internal tools for experimentation and debugging. Integrating with product surfaces and building lightweight ML components are also key. The team operates with startup energy in a fast-moving AI environment.

What you'd actually do

  1. Design and ship LLM‑powered assistant features, including conversational flows, agentic behaviors, retrieval pipelines, and multimodal interactions.
  2. Build prompt architectures, system instructions, and orchestration logic that ensure reliability, grounding, and personality consistency.
  3. Build and maintain evaluation frameworks for correctness, safety, grounding, and UX quality.
  4. Run hillclimbing loops across prompts, models, and tool‑use strategies to continuously improve assistant performance.
  5. Develop internal tools for prompt experimentation, model comparison telemetry, and debugging automated eval pipelines.

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.

Nice to have

  • Master's Degree AND 3+ years of experience in engineering, problem solving, model building, evaluation, data analysis OR equivalent experience.
  • 2+ years shipping production-level code, models, or data analysis.
  • 1+ years using AI-assisted coding and analysis techniques.
  • Experience working on small teams and mid-stage startup environments.
  • Experience working on AI products.
  • PhD in engineering, applied math, statistics, or related analytical field.
  • 4+ years shipping production-level code, models, or data analysis.
  • Deep experience building from zero-to-one.
  • Hands-on work hillclimbing AI evaluations.

What the JD emphasized

  • shipping production-level code, models, or data analysis
  • building from zero-to-one
  • Hands-on work hillclimbing AI evaluations

Other signals

  • LLM product engineering
  • evaluation science
  • hillclimbing
  • internal tool building
  • LLM-powered assistant features
  • agentic behaviors
  • retrieval pipelines
  • multimodal interactions
  • prompt architectures
  • orchestration logic
  • evaluation frameworks
  • hillclimbing loops
  • model comparison telemetry
  • automated eval pipelines
  • reusable frameworks
  • lightweight ML components
  • ranking
  • classification
  • summarization
  • personalization