Lead AI Engineer

Klaviyo · Enterprise · Palo Alto, CA · Engineering

The Lead AI Engineer is responsible for designing, building, and scaling backend systems and agentic capabilities for Marketing and Service AI products. The role involves owning complex services end-to-end, contributing to architecture, and ensuring reliable, scalable production capabilities for generative and agentic AI solutions. The work is backend-heavy: async processing pipelines, distributed systems, AI serving, and evolving agentic architectures, with a strong emphasis on evaluation and best practices for AI systems.

What you'd actually do

  1. Design and build core AI services.
  2. Scale AI data and inference pipelines.
  3. Build and harden AI serving systems.
  4. Evolve our agentic architecture.
  5. Apply and refine best practices for AI systems.

Skills

Required

  • 5-7+ years of professional software engineering experience with a strong focus on backend and distributed systems
  • Has led complex projects end-to-end within a team and owned services in production
  • Has built and shipped generative or agentic AI applications (e.g., LLM-backed flows, tool-using agents, retrieval-augmented systems)
  • Comfortable with prompt design, few-shot approaches, fine-tuning, and evaluation
  • Has built reliable services, async processing pipelines, and distributed task queues (e.g., Celery, Kafka, SQS, RabbitMQ, Redis) that support high-throughput workloads
  • Proficient in Python and modern backend frameworks (FastAPI, Django, or similar)
  • Experience using big data tools such as Spark/Hadoop and ORM and migration tools such as SQLAlchemy/Alembic
  • Experience with AWS and Kubernetes, CI/CD pipelines, observability, and operational best practices
  • Understands how infrastructure choices affect reliability, latency, and cost
  • Evaluation and quality-minded

Nice to have

  • Mentor and uplevel teammates: provide thoughtful code reviews, share patterns and examples, and help mid-level and junior engineers grow stronger in building distributed and AI-powered systems
  • Help shape the Palo Alto hub: participate in interviewing, onboarding, and local engineering rituals, and contribute ideas that make Palo Alto a highly collaborative, high-bar hub that works seamlessly with other locations

What the JD emphasized

  • agentic capabilities
  • generative and agentic models
  • generative or agentic AI applications
  • tool-using agents
  • retrieval-augmented systems
  • agentic architecture
  • agents plan, call tools, and react to feedback

Other signals

  • designing and building scalable backend systems and user experiences that power our AI products and agentic solutions
  • own complex services end-to-end
  • contribute to architecture for high-impact AI features
  • partner closely with product managers, machine learning engineers, and data scientists to turn AI ideas into reliable, scalable production capabilities
  • hands-on, backend-heavy role with opportunities to influence architecture, async processing pipelines, distributed systems
  • Scale AI data and inference pipelines
  • Build and harden AI serving systems
  • Evolve our agentic architecture
  • Apply and refine best practices for AI systems
  • Measure what matters
  • experienced backend engineer
  • Hands-on with generative & agentic AI in production
  • Strong distributed systems and async background
  • Fluent in Python and data tooling
  • Comfortable in cloud-native environments
  • Evaluation and quality-minded