LLM Platform Engineer

Whatnot · Consumer · San Francisco, CA · Engineering

Whatnot is seeking an LLM Platform Engineer to build and scale the core infrastructure for large language model applications. This role involves designing and deploying RAG systems, developing LLM evaluation frameworks, and creating human-in-the-loop feedback pipelines to integrate AI into critical business functions like growth, recommendations, trust and safety, and fraud.

What you'd actually do

  1. Own the infrastructure powering LLMs across critical business surfaces, supporting growth, recommendations, trust and safety, fraud, seller tooling, and more.
  2. Create robust and scalable LLM evaluation frameworks to measure model performance, guide iteration, and prevent regression via CI/CD.
  3. Deploy RAG systems and MCP servers to more effectively ground LLM responses in Whatnot’s business context while enforcing rigorous PII controls.
  4. Design efficient human-in-the-loop feedback pipelines that can be used to inform scalable LLM evaluation.
  5. Bridge the gap between research and production, helping to transform experimental ideas into scalable solutions.

Skills

Required

  • Python
  • Software Engineering
  • Production Systems
  • PostgreSQL
  • DynamoDB
  • Elasticsearch
  • Redis
  • Datadog
  • Grafana
  • AWS SageMaker
  • AWS Lambda
  • AWS Kinesis
  • AWS S3
  • AWS EC2
  • AWS EKS/ECS
  • Apache Kafka
  • Apache Flink

Nice to have

  • LLM infrastructure
  • RAG systems
  • LLM evaluation frameworks
  • Human-in-the-loop feedback pipelines

Experience

  • 4+ years of professional experience developing machine learning systems and algorithms
  • 3+ years of software engineering experience building and maintaining production systems for consumer-scale loads
  • 1+ years of professional experience developing software in Python
