Senior AI Engineer

Klaviyo · Enterprise · Boston, MA +1 · IT & Security

Klaviyo is looking for a Senior AI Engineer to design, build, and scale production-grade LLM integrations and AI-powered automation tools. The role sits at the intersection of applied AI research and full-stack software engineering and owns LLM integration architecture, RAG system design, CI/CD pipeline management, and production observability. The goal is to build AI that real people use at scale every day.

What you'd actually do

  1. Design and build LLM-powered applications: Architect and develop production-grade systems leveraging large language models (GPT-4, Claude, Mistral, Llama) for internal automation, knowledge retrieval, and intelligent workflow assistance.
  2. Build and maintain AI integration pipelines: Design robust, scalable API integrations between AI models and internal systems such as HRIS, ticketing, CRM, and knowledge bases, ensuring reliability, low latency, and high availability.
  3. Develop full-stack AI features end-to-end: Own the complete development of AI features across the stack, from backend model orchestration and API layers to frontend interfaces that make AI accessible to non-technical employees.
  4. Architect and manage CI/CD pipelines for AI systems: Build and maintain automated deployment pipelines for AI models and services, including model evaluation frameworks, A/B testing infrastructure, and safe rollout strategies.
  5. Implement RAG and retrieval systems: Design and build retrieval-augmented generation systems grounded in Klaviyo's internal knowledge, ensuring accurate, relevant, and up-to-date responses at scale.

Skills

Required

  • 3+ years of software engineering experience, with at least 2 years focused on building AI, ML, or LLM-powered systems in production.
  • Deep expertise with large language models from providers and model families such as OpenAI, Anthropic, Mistral, or Llama, and experience integrating them into production applications via API.
  • Strong full-stack engineering skills across Python (FastAPI, Flask, or Django) and TypeScript or JavaScript, with experience in modern frontend frameworks.
  • Proven experience designing and building CI/CD pipelines for software or ML systems using tools such as GitHub Actions, Jenkins, or CircleCI.
  • Experience with RAG systems, vector databases (Pinecone, Weaviate, pgvector), and embedding models.
  • Proficiency with cloud infrastructure (AWS, GCP, or Azure), containerization (Docker, Kubernetes), and infrastructure-as-code tools such as Terraform or Pulumi.
  • Experience with prompt engineering, model evaluation frameworks, and LLM observability tools such as LangSmith or Weights & Biases.

Nice to have

  • knowledge retrieval
  • intelligent workflow assistance
  • low latency
  • high availability
  • model orchestration
  • API layers
  • frontend interfaces
  • A/B testing infrastructure
  • safe rollout strategies
  • Klaviyo internal knowledge
  • accuracy
  • relevance
  • up-to-date responses
  • benchmarking
  • fine-tuning of foundation models
  • parameter-efficient fine-tuning techniques
  • coding standards
  • testing frameworks
  • architectural patterns
  • latency monitoring
  • output quality evaluation
  • hallucination detection
  • cost dashboards
  • production alerting
  • Product and Data Science collaboration
  • data scientists
  • ML engineers
  • translation of requirements into technical implementations
  • feedback loop between experimentation and production
  • data privacy best practices
  • guardrails for sensitive data handling
  • compliance with Klaviyo security and governance standards

What the JD emphasized

  • production-grade LLM integrations
  • AI-powered automation tools
  • production observability
  • AI that real people use at scale every day
  • production-grade systems
  • AI integration pipelines
  • full-stack AI features end-to-end
  • CI/CD pipelines for AI systems
  • RAG and retrieval systems
  • AI models
  • AI engineering best practices
  • AI observability and monitoring tooling
  • AI systems
  • building AI, ML, or LLM-powered systems in production
  • integrating LLMs into production applications via API
  • CI/CD pipelines for software or ML systems
  • RAG systems
  • vector databases
  • embedding models
  • cloud infrastructure
  • containerization
  • infrastructure-as-code tools
  • prompt engineering
  • model evaluation frameworks
  • LLM observability tools

Other signals

  • Design and build LLM-powered applications
  • Build and maintain AI integration pipelines
  • Develop full-stack AI features end-to-end
  • Architect and manage CI/CD pipelines for AI systems
  • Implement RAG and retrieval systems
  • Evaluate and fine-tune AI models
  • Establish AI engineering best practices
  • Build AI observability and monitoring tooling
  • Drive security and compliance in AI systems