Staff, Data Scientist – Conversational AI

Walmart · Retail · Bellevue, WA

Staff Data Scientist role focused on building and deploying production-grade conversational AI systems and agentic AI architectures for Walmart's core conversational AI platform, which powers intelligent assistants for millions of customers. The role involves designing and developing Generative AI systems, leading agentic AI development, developing prompt engineering strategies, building end-to-end pipelines, and defining best practices for LLM-based production systems at scale.

What you'd actually do

  1. Design and develop Generative AI and conversational systems powered by transformer-based LLM architectures (e.g., GPT, Claude, Gemini).
  2. Lead the development of agentic AI systems, including system design for multi-agent workflows, reasoning chains, tool use, and orchestration frameworks.
  3. Develop and optimize prompt engineering strategies, including system prompts, structured prompting, retrieval augmentation, and guardrail design.
  4. Build and deploy end-to-end conversational AI pipelines, from data preparation and model training to evaluation and monitoring.
  5. Design evaluation frameworks and experimentation methodologies for conversational AI systems, including offline benchmarks and online A/B testing.

Skills

Required

  • Master’s degree in Machine Learning, Computer Science, Engineering, Mathematics, Statistics, or a related field, or equivalent industry experience.
  • 5+ years of experience in ML, Data Science, Applied Machine Learning, or AI-related roles.
  • Strong expertise in Natural Language Processing (NLP), Conversational AI, or Generative AI systems.
  • Hands-on experience designing and deploying LLM-based applications using transformer architectures.
  • Prompt engineering expertise is required, including structured prompting, RAG, system prompt design, and safety techniques.
  • Experience building agentic AI systems and system design for multi-agent architectures.
  • Strong programming skills in Python and working knowledge of SQL.
  • Strong understanding of ML evaluation metrics, statistical methods, and experimental design.
  • Ability to drive projects from research and prototyping through production launch.

Nice to have

  • PhD in Machine Learning, Computer Science, Statistics, Applied Mathematics, Physics, or a related field.
  • Experience building multimodal AI systems combining text, voice, images, and UI context.
  • Experience with production-scale conversational agents or AI assistants.
  • Familiarity with LLM serving optimization techniques, including LoRA and multi-LoRA.
  • Experience designing LLM evaluation frameworks, error analysis pipelines, and model quality monitoring.

What the JD emphasized

  • production-grade conversational systems
  • agentic AI systems
  • Prompt engineering expertise is required
  • agentic AI systems and system design for multi-agent architectures
  • LLM-based production systems at Walmart scale

Other signals

  • applying large language models
  • agentic AI architectures
  • massive scale