Principal Data Scientist, AI Foundations

Capital One · Banking · New York, NY +2

This role focuses on building and shipping AI/ML solutions for Capital One's mobile app, leveraging LLMs and generative AI. The Principal Data Scientist will partner with cross-functional teams to deliver AI-powered products, adapt and fine-tune LLMs for customer-facing applications, and build ML/NLP models through all phases of development, including training, evaluation, and validation, with a strong emphasis on operationalizing them in production systems serving millions of customers. Experience training language models and computer vision models is required, as is expertise in areas such as training optimization, self-supervised learning, explainability, and RLHF, along with a track record of delivering models at scale.

What you'd actually do

  1. Partner with a cross-functional team of data scientists, software engineers, machine learning engineers, and product managers to deliver AI-powered products that change how customers interact with their money.
  2. Leverage a broad stack of technologies (PyTorch, AWS UltraClusters, Hugging Face, LangChain, Lightning, VectorDBs, and more) to reveal the insights hidden within huge volumes of numeric and textual data.
  3. Be the Natural Language Processing (NLP) expert: harness the power of Large Language Models (LLMs), and adapt and fine-tune them for customer-facing applications and features.
  4. Build machine learning and NLP models through all phases of development, from design through training, evaluation, and validation, and partner with engineering teams to operationalize them in scalable and resilient production systems that serve 80+ million customers.
  5. Flex your interpersonal skills to translate the complexity of your work into tangible business goals.

Skills

Required

  • Natural Language Processing (NLP)
  • Large Language Models (LLMs)
  • PyTorch
  • AWS
  • Hugging Face
  • LangChain
  • Lightning
  • VectorDBs
  • training language models
  • training computer vision models
  • training optimization
  • self-supervised learning
  • explainability
  • RLHF
  • Python
  • SQL

Nice to have

  • Scala
  • R
  • MBA with a quantitative concentration

What the JD emphasized

  • operationalize them in scalable and resilient production systems
  • delivering models at scale both in training data and inference volumes
  • delivering libraries, platforms, or solution-level code to existing products

Other signals

  • building and shipping state-of-the-art, scalable architecture and AI/ML solutions
  • deliver AI-powered products