Senior Machine Learning Engineer

Reddit · Consumer · United States · Remote · Ads Engineering

Senior Machine Learning Engineer at Reddit, focused on building and deploying production ML systems for advertising and for consumer-facing features such as recommendations, search, and content understanding. The role covers the full ML lifecycle, from research and modeling to production deployment and monitoring, with a focus on large-scale data and model pipelines and on improving system performance. Experience with LLM/Gen AI techniques is preferred.

What you'd actually do

  1. Design, build, and deploy production-grade machine learning models and systems at scale
  2. Own the full ML lifecycle: from problem definition and feature engineering to training, evaluation, deployment, and monitoring
  3. Build scalable data and model pipelines with strong reliability, observability, and automated retraining
  4. Work with large-scale datasets to improve ranking, recommendations, search relevance, prediction, content/user understanding, and optimization systems
  5. Research and apply state-of-the-art machine learning and AI techniques, including deep learning, graph- and transformer-based models, and LLM evaluation/alignment

Skills

Required

  • Python, Java, Go, or similar languages
  • solid software engineering fundamentals
  • ML Fundamentals: a strong grasp of algorithms, from classic statistical learning (XGBoost, Random Forests, regressions) to DL architectures (Transformers, CNNs, GNNs)
  • modern ML frameworks (e.g., PyTorch, TensorFlow)
  • designing scalable ML pipelines, data processing systems, and model serving infrastructure
  • working cross-functionally and translating ambiguous product or business problems into technical solutions
  • improving measurable metrics through applied machine learning

Nice to have

  • recommender systems, search/ranking systems, advertising/auction systems, large-scale representation learning, or multimodal embedding systems
  • distributed systems and large-scale data processing (Spark, Kafka, Ray, Airflow, BigQuery, Redis, etc.)
  • real-time systems and low-latency production environments
  • feature engineering, model optimization, and production monitoring
  • LLM/Gen AI techniques, including but not limited to LLM evaluation, alignment, fine-tuning, knowledge distillation, RAG/agentic systems, and productionizing LLM-powered products at scale
  • Advanced degree in Computer Science, Machine Learning, or related quantitative field

What the JD emphasized

  • building, deploying, and operating machine learning systems in production
  • large-scale datasets
  • production ML systems
  • LLM/Gen AI techniques
  • productionizing LLM-powered products at scale

Other signals

  • large-scale applied machine learning
  • build systems end-to-end
  • production ML systems
  • massive scale