Senior Data Engineer, Amazon Customer Service

Amazon · Big Tech · Austin, TX · Data Science

Senior Data Engineer role focused on building and maintaining data infrastructure for AI/ML and LLM-based systems, specifically agentic workflows and RAG architectures. The role involves designing data pipelines, vector databases, embedding pipelines, and developing observability/evaluation pipelines for LLM features.

What you'd actually do

  1. Design, develop, and maintain scaled, automated, user-friendly systems, reports, and dashboards.
  2. Partner with operations, business, economist, and ML teams to consult on, develop, and implement KPIs, automated reporting and process solutions, and data infrastructure improvements that meet business needs.
  3. Build and maintain data infrastructure for AI agent systems, including vector databases, embedding pipelines, and retrieval-augmented generation (RAG) data stores.
  4. Design data architectures that enable agentic workflows: structured data access layers, tool-use APIs, and context management systems that AI agents consume autonomously, as well as self-serve analytics.
  5. Develop observability and evaluation pipelines for LLM-powered features, including tracking model performance, hallucination rates, latency, and cost metrics at scale.

Skills

Required

  • 5+ years of data engineering experience
  • Experience with data modeling, warehousing and building ETL pipelines
  • Experience with SQL
  • Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS
  • Experience mentoring team members on best practices
  • Experience with AI/ML technologies

Nice to have

  • Experience with big data technologies such as: Hadoop, Hive, Spark, EMR
  • Experience operating large data warehouses

What the JD emphasized

  • Experience with AI/ML technologies

Other signals

  • building data infrastructure for AI/ML and LLM-based systems
  • designing pipelines that feed agentic workflows and retrieval-augmented generation (RAG) architectures
  • build and maintain data infrastructure for AI agent systems, including vector databases, embedding pipelines, and retrieval-augmented generation (RAG) data stores
  • design data architectures that enable agentic workflows
  • develop observability and evaluation pipelines for LLM-powered features