Senior, Data Scientist

Walmart · Retail · Sunnyvale, CA

Senior Data Scientist (Machine Learning Engineer) on the Catalog Data Quality team at Walmart Global Tech. Focuses on using GenAI, AI, ML, deep learning, computer vision, and optimization for product catalog accuracy and customer experience. Develops, deploys, and scales ML models for attribute extraction, supporting the attribute extraction platform and catalog data quality solutions. Drives adoption of GenAI-powered solutions to improve catalog coverage and accuracy. Owns the full model lifecycle from experimentation to production monitoring.

What you'd actually do

  1. Design, develop, and deploy AI/ML, NLP, and LLM models into production environments with a focus on reliability and scalability. Own the full model lifecycle, from experimentation and offline evaluation through serving, monitoring, and iterative improvement in production.
  2. Integrate data science solutions into current business processes.
  3. Develop and recommend process standards and best practices in Machine Learning as applicable to the retail industry.
  4. Spearhead collaborations with other senior team members and stakeholders, applying your data science expertise to drive strategic decision-making and optimize business operations.
  5. Promote and support company policies, procedures, mission, values, and standards of ethics and integrity.

Skills

Required

  • PhD with > 3 years of relevant experience / Master’s degree with > 4 years of experience / 4-year bachelor’s degree with > 6 years of experience
  • AI/ML modeling
  • Generative AI technologies: LLMs, multimodal models, RAG architectures, prompt engineering, and fine-tuning (LoRA/QLoRA)
  • GenAI, DL, Vision-based models
  • Classical ML, deep learning, and modern architectures: CNNs, Transformers, and domain-specific variants
  • Programming skills across the data science, statistical analysis, big data, and ML stack
  • MLOps, Spark, Kubernetes, GCP
  • Machine Learning
  • NLP
  • Computer Vision
  • Deep learning
  • Python
  • GenAI

Nice to have

  • LLM Optimization

What the JD emphasized

  • production code
  • deploy and support model services and pipelines
  • extract attributes from product information
  • attribute extraction platform
  • catalog data quality solutions
  • GenAI-powered solutions
  • production environments
  • full model lifecycle
  • production
  • latest AI/ML research
  • production-grade solutions
  • delivering high-impact AI/ML solutions to Production

Other signals

  • deploying and supporting model services and pipelines