Senior Staff Machine Learning Engineer, GenAI Platform

Reddit · Consumer · United States · Remote · BE Platform

Reddit is seeking a Senior Staff Machine Learning Engineer to lead the vision, strategy, and architecture for its large-scale GenAI Platform. This role involves defining the platform's operating model, driving core capabilities like a unified LLM Gateway, RAG systems, and agentic workflows, and establishing LLMOps standards for scalability, reliability, and developer experience.

What you'd actually do

  1. Lead and execute the vision, strategy, and roadmap for Reddit’s large-scale GenAI Platform.
  2. Define the platform architecture and operating model that enable teams to build, deploy, and scale GenAI products reliably.
  3. Drive the strategy for a unified LLM Gateway supporting internally and externally hosted LLMs through consistent APIs and abstractions.
  4. Set the direction for core platform capabilities such as rate and token limit management, intelligent failover, and production resilience.
  5. Shape Reddit’s approach to an enterprise-grade RAG system.

Skills

Required

  • ML Engineering
  • AI Platform Engineering
  • Cloud AI Deployment
  • Kubernetes
  • AWS
  • Google Cloud Storage
  • Terraform
  • Go
  • Python

Nice to have

  • model serving
  • inference pipelines
  • monitoring
  • observability for AI systems

What the JD emphasized

  • lead the vision
  • strategy
  • architecture
  • scale
  • production adoption
  • long-term evolution
  • large-scale GenAI Platform
  • GenAI products reliably
  • enterprise-grade RAG system
  • agentic AI workflows
  • scalability, reliability, performance, and developer experience

Other signals

  • GenAI Platform
  • LLM Gateway
  • RAG system
  • agentic AI workflows
  • LLMOps