Technical Lead Manager, ML Infrastructure

Whatnot · Consumer · San Francisco, CA · Engineering

Lead the development and scaling of core ML infrastructure, including low-latency model serving, streaming feature ingestion, distributed training, and high-throughput GPU inference, to power AI/ML applications at consumer scale. This role involves hands-on coding, architectural guidance, and empowering ML scientists.

What you'd actually do

  1. Own the infrastructure powering AI and ML models across critical business surfaces: growth, recommendations, trust and safety, fraud, seller tooling, and more.
  2. Guide the prototyping, deployment, and productionization of novel ML architectures that directly shape user experience and marketplace dynamics.
  3. Help design and scale inference infrastructure capable of serving large models with low latency and high throughput.
  4. Oversee and evolve the real-time feature pipelines that feed both our online and offline feature stores, ensuring single-second feedback from behavioral signals, high reliability, and high-fidelity data for model training.
  5. Drive improvements to the feature platform and expand its scope to non-ML use cases, such as fraud rules, where point-in-time backtesting is also critical.

Skills

Required

  • 1+ years of experience as a Technical Lead Manager (TLM) developing production machine learning systems at consumer-scale loads
  • 5+ years of hands-on software engineering experience building and maintaining production systems for consumer-scale loads
  • Proficiency in Python
  • Experience with operational, search, and key-value databases such as PostgreSQL, DynamoDB, Elasticsearch, and Redis
  • Experience with ML-specific tools and frameworks such as MLflow, LitServe, TorchServe, and Triton
  • Familiarity with monitoring and observability tools, e.g. Datadog and Grafana
  • Experience with cloud platforms and managed services (AWS SageMaker, Lambda, Kinesis, S3, EC2, EKS/ECS) and streaming frameworks such as Apache Kafka and Flink

Nice to have

  • Bachelor’s degree in Computer Science, Statistics, Applied Mathematics or a related technical field, or equivalent work experience.

What the JD emphasized

  • hands-on builders
  • deeply technical leaders
  • shape the future of AI and ML
  • core infrastructure that powers machine learning
  • self-hosted large language model applications
  • cutting-edge models
  • near-realtime features
  • systems that make advanced ML dependable and fast at scale
  • low-latency deep learning model serving
  • streaming feature ingestion
  • distributed training
  • high-throughput GPU inference
  • strong technical depth
  • getting and staying in the weeds
  • code at least a day a week
  • 1+ years of TLM experience developing production machine learning systems at consumer-scale loads
  • 5+ years of hands-on software engineering experience building and maintaining production systems for consumer-scale loads
  • ML-specific tools and frameworks

Other signals

  • model iteration