Senior Data Scientist

Walmart · Retail · Bangalore, KA, India

Senior Data Scientist role focused on improving product content quality, using NLP and Generative AI, to raise search accuracy and customer experience. Responsibilities include developing, testing, and deploying AI models, with emphasis on RAG-based LLM agents, embedding generation, vector databases, and prompt engineering. The role also involves cross-functional collaboration and responsibility for model scalability and performance monitoring.

What you'd actually do

  1. Translate complex business problems into data-driven solutions using advanced analytical and mathematical methods.
  2. Identify and source relevant data sets, ensuring data quality and suitability for modeling purposes.
  3. Develop, test, and validate custom analytical models leveraging machine learning, deep learning, and statistical techniques.
  4. Collaborate with cross-functional teams to align data insights with business objectives and support decision-making.
  5. Deploy scalable models into production environments, ensuring sustainability and performance monitoring.

Skills

Required

  • Python
  • PySpark
  • Google Cloud Platform
  • Model deployment
  • Big data platforms
  • Machine learning
  • Supervised and unsupervised learning
  • NLP
  • Classification
  • Data/Text Mining
  • Multi-modal models
  • Neural Networks
  • Deep Learning Algorithms

Nice to have

  • Embedding generation from training materials
  • Storage and retrieval from vector databases
  • Setup and provisioning of managed LLM gateways
  • Development of retrieval-augmented generation (RAG) based LLM agents
  • Model selection
  • Prompt engineering
  • Fine-tuning based on accuracy and user feedback
  • Monitoring and governance

What the JD emphasized

  • Embedding generation from training materials, storage and retrieval from vector databases, setup and provisioning of managed LLM gateways, development of retrieval-augmented generation (RAG) based LLM agents, model selection, prompt engineering and fine-tuning based on accuracy and user feedback, monitoring and governance.

Other signals

  • Developing and deploying advanced AI models, including NLP and Generative AI
  • Embedding generation from training materials, storage and retrieval from vector databases, setup and provisioning of managed LLM gateways, development of retrieval-augmented generation (RAG) based LLM agents, model selection, prompt engineering and fine-tuning
  • Deploy scalable models into production environments