Senior Software Engineer, GenAI Platform

Reddit · Consumer · United States · Remote · Safety

Senior Software Engineer to lead the development of a large-scale GenAI Platform at Reddit, focusing on the LLM Gateway, RAG applications, agentic workflows, and MLOps/LLMOps, and establishing best practices for observability and governance.

What you'd actually do

  1. Contribute to the design, implementation, and maintenance of the LLM Gateway, focusing on features like unified API endpoints for internally and externally hosted LLMs, rate/token limit management, and intelligent failover mechanisms to improve uptime and reliability.
  2. Design and develop ML and Generative AI systems in cloud-based production environments at scale.
  3. Build and manage enterprise-grade RAG applications using embeddings, vector search, and retrieval pipelines.
  4. Implement and operationalize agentic AI workflows with tool use, using frameworks such as LangChain and LangGraph.
  5. Drive adoption of MLOps / LLMOps practices, including CI/CD automation, versioning, testing, and lifecycle management.

Skills

Required

  • ML Engineering
  • AI Platform Engineering
  • Cloud AI Deployment
  • Kubernetes at scale
  • cloud-based technologies for supporting an ML platform
  • AWS
  • Google Cloud Storage
  • infrastructure-as-code (Terraform)
  • Go
  • Python
  • scalability
  • reliability
  • performance
  • ease of use

Nice to have

  • model serving
  • inference pipelines
  • monitoring
  • observability for AI systems
  • LangChain
  • Vertex AI Agent Builder
  • TensorFlow
  • PyTorch

What the JD emphasized

  • lead the development of a large-scale GenAI Platform
  • enterprise-grade RAG applications
  • agentic AI workflows
  • MLOps / LLMOps practices
  • observability, monitoring, evaluation, and governance of GenAI pipelines
  • Strong ownership mindset and platform thinking
  • Ability to lead AI platform delivery from concept to production

Other signals

  • LLM Gateway
  • RAG applications
  • agentic AI workflows
  • MLOps / LLMOps practices
  • observability, monitoring, evaluation, and governance of GenAI pipelines