Staff Machine Learning Engineer, Consumer

Reddit · Consumer · United States · Remote · Machine Learning

Staff Machine Learning Engineer at Reddit focused on building and deploying large-scale ML systems for consumer products, including recommendations, search, and LLM/GenAI capabilities. The role involves end-to-end ownership from ideation to production, technical strategy, and collaboration with cross-functional teams.

What you'd actually do

  1. Lead end-to-end ML initiatives from ideation through production and iteration, shaping technical direction and translating product goals into scalable solutions
  2. Architect, build, and deploy large-scale ML systems across recommendation, search, and content/user understanding, including retrieval/ranking models, representation learning and embedding optimization, and LLM or GenAI-powered capabilities
  3. Drive measurable impact on user engagement, discovery, and long-term value
  4. Collaborate with cross-functional teams to align product and technical roadmaps and unlock key future ML capabilities
  5. Stay at the forefront of AI research, evaluating and introducing new AI/ML paradigms to keep Reddit’s ML ecosystem at the cutting edge

Skills

Required

  • 7+ years of experience building, deploying, and operating machine learning systems in production
  • Deep understanding of machine learning methods, spanning classical approaches and modern deep learning (e.g., Transformers, GNNs)
  • Expert at developing and productionizing models using TensorFlow, PyTorch, or Hugging Face Transformers
  • Experience building production-quality code incorporating testing, evaluation, and monitoring using object-oriented programming, including experience in Python and Golang
  • Experience designing and scaling ML systems, including data pipelines, feature engineering, model training/serving, and production monitoring
  • Excellent communication and collaboration skills, with the ability to discuss complex technical topics with diverse teams and translate product needs into scalable ML solutions
  • Track record of driving measurable impact through applied machine learning in real-world products

Nice to have

  • Familiarity with distributed systems and large-scale data processing frameworks (Spark, Kafka, Ray, Airflow, BigQuery, Redis, etc.)
  • Experience working with real-time systems and low-latency production environments
  • Experience with LLM/GenAI techniques, including but not limited to LLM evaluation, alignment, fine-tuning, knowledge distillation, RAG/agentic systems and productionizing LLM-powered products at scale
  • Strong experimentation rigor, with experience formulating clear hypotheses, designing actionable learning plans and building offline/online correlations
  • Advanced degree in Computer Science, Machine Learning, or related quantitative field

What the JD emphasized

  • 7+ years of experience building, deploying, and operating machine learning systems in production
  • Expert at developing and productionizing models using TensorFlow, PyTorch, or Hugging Face Transformers
  • Experience building production-quality code incorporating testing, evaluation, and monitoring using object-oriented programming, including experience in Python and Golang
  • Experience designing and scaling ML systems, including data pipelines, feature engineering, model training/serving, and production monitoring
  • Track record of driving measurable impact through applied machine learning in real-world products
  • Experience with LLM/GenAI techniques, including but not limited to LLM evaluation, alignment, fine-tuning, knowledge distillation, RAG/agentic systems and productionizing LLM-powered products at scale

Other signals

  • building systems end-to-end
  • research and modeling to production deployment
  • large-scale applied machine learning
  • recommendations, search, messaging, and foundational AI systems
  • LLM or GenAI-powered capabilities