(usa) Principal, Data Scientist

Walmart · Retail · Sunnyvale, CA +1

Principal Data Scientist at Walmart Marketplace responsible for leading the development, scaling, and deployment of complex ML products. This role involves partnering with business stakeholders, guiding other data scientists, driving best practices in ML Ops, and quantifying business impact. Experience in traditional ML and GenAI, including RAG and LLM agents, is required, with a focus on deploying these solutions into production.

What you'd actually do

  1. Consult with business stakeholders on algorithm-based recommendations and act as a thought leader in turning them into business actions.
  2. Closely partner with the Senior Manager and Director of Data Science to drive data science adoption in the domain.
  3. Guide data scientists, senior data scientists, and staff data scientists across multiple sub-domains to ensure on-time delivery of ML products.
  4. Drive efficiency across the domain in terms of DS and ML best practices, ML Ops practices, resource utilization, reusability, and multi-tenancy.
  5. Lead multiple complex ML products and guide senior tech leads in the domain in efficiently leading their products.

Skills

Required

  • Python
  • R
  • PySpark
  • Google Cloud Platform
  • Vertex AI
  • Kubeflow
  • model deployment
  • Hadoop
  • Hive
  • MapReduce
  • HQL
  • Scala
  • classification models
  • regression models
  • NLP
  • forecasting
  • unsupervised models
  • optimization
  • graph ML
  • causal inference
  • causal ML
  • statistical learning
  • experimentation
  • Gen-AI
  • embedding generation
  • vector databases
  • LLM gateways
  • retrieval-augmented generation
  • LLM agents
  • prompt engineering
  • fine-tuning
  • model monitoring
  • model governance
  • fraud prevention
  • shrink and waste reduction
  • inventory management
  • recommendation systems
  • assortment optimization
  • price optimization

Nice to have

  • GPU/CUDA for computational efficiency

What the JD emphasized

  • Minimum 10 years of experience as a data science technical lead.
  • Deep experience in building data science solutions in areas like fraud prevention, forecasting, shrink and waste reduction, inventory management, recommendation, assortment, and price optimization.
  • Deep experience in simultaneously leading multiple data science initiatives end-to-end: translating business needs into analytical asks, leading the building of solutions, and eventually deploying and maintaining them.
  • Strong experience in machine learning: classification models, regression models, NLP, forecasting, unsupervised models, optimization, graph ML, causal inference, causal ML, statistical learning, experimentation, and Gen-AI.
  • In Gen-AI, it is desirable to have experience in embedding generation from training materials, storage and retrieval from vector databases, setup and provisioning of managed LLM gateways, development of retrieval-augmented generation-based LLM agents, model selection, iterative prompt engineering and fine-tuning based on accuracy and user feedback, monitoring, and governance.
  • Ability to scale and deploy data science solutions.

Other signals

  • Build machine learning products
  • Drive data science adoption
  • Ensure on-time delivery of ML products
  • Drive efficiency across the domain in terms of DS and ML best practices
  • Lead multiple complex ML products
  • Quantify business impact and ensure regular impact measurement of all ML products
  • Utilize a product mindset to build, scale, and deploy holistic data science products