Member of Technical Staff - Data Platform Engineer, Frontier AI Robotics

Amazon · Big Tech · San Francisco, CA · Software Development

The role is for a Data Platform Engineer focused on building and maintaining the data infrastructure for robotics manipulation research at Amazon's Frontier AI Robotics team. This involves creating systems to process raw robot data into trainable datasets, including streaming ingestion, data curation, quality controls, and tools for researchers. The role requires full-stack development experience in a cloud-native environment and collaboration with researchers.

What you'd actually do

  1. Build and maintain the data infrastructure that powers our robotics manipulation research
  2. Work alongside our existing team of platform engineers to extend the systems that turn raw robot session data into curated, trainable episodes
  3. Own streaming ingestion pipelines, platform and schema design, support for heterogeneous data sources, data curation and quality controls, full-stack inspection tools and dataset builders that researchers and human annotators actually use, and tools that let scientists go from dataset to training job without leaving the platform
  4. Work across the full stack in a fast iteration cycle, with researchers as close customers
  5. Ship full-stack data infrastructure that real users depend on, treat researchers as collaborators rather than customers, and keep a strong bias toward iteration in a flat org where engineers pick up science-driven work directly instead of waiting for approval layers

Skills

Required

  • 5+ years of non-internship professional software development experience
  • 5+ years of experience owning the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Strong software engineering background with full-stack development experience
  • Expertise in distributed systems, cloud computing, and scalable data processing
  • Experience with data pipeline design, ETL processes, and data management systems

Nice to have

  • Experience as a mentor or tech lead, or leading an engineering team
  • 5+ years of programming experience with at least one programming language
  • 5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Bachelor's degree in computer science or equivalent
  • Experience with dataset curation and quality assessment techniques
  • Knowledge of computer vision and multimodal data processing
  • Deep understanding of machine learning fundamentals, particularly large-scale model training
  • Background in research environments or supporting ML research workflows
  • Experience with data visualization and annotation tooling
  • Familiarity with modern data filtering and deduplication methodologies
  • Proficiency in translating academic concepts into production systems

What the JD emphasized

  • full-stack development experience
  • full-stack data infrastructure real users depend on
  • researchers as collaborators rather than customers
  • strong bias toward iteration

Other signals

  • building the future of intelligent robotics through frontier foundation models
  • developing sophisticated perception systems
  • creating adaptive manipulation strategies
  • building systems that scale to meet the demands of Amazon's global operations
  • data infrastructure that powers our robotics manipulation research
  • turn raw robot session data into curated, trainable episodes
  • streaming ingestion pipelines
  • data curation and quality controls
  • dataset-builders that researchers and human annotators actually use
  • tools to let scientists go from dataset to training job without leaving the platform