Sr. SDE - ML Data Infrastructure, Frontier AI Robotics

Amazon · Big Tech · San Francisco, CA · Software Development

Senior Software Development Engineer focused on ML Data Infrastructure for Frontier AI Robotics at Amazon. The role involves building and maintaining scalable data infrastructure, designing dataset management systems, developing visualization tools, and implementing advanced data filtering techniques to support cutting-edge AI robotics research. Collaboration with science teams is key, requiring both infrastructure development and hands-on technical contribution to data preparation.

What you'd actually do

  1. Build and maintain scalable data infrastructure to support cutting-edge AI robotics research.
  2. Design dataset management systems including automated pipelines for data ingestion, processing, and curation.
  3. Develop visualization and inspection tools for dataset exploration and quality assessment.
  4. Research and implement state-of-the-art data filtering techniques including deduplication, quality scoring, and model-based filtering methods.
  5. Collaborate directly with science teams to support research projects through both infrastructure development and hands-on technical contribution to data preparation workflows.

Skills

Required

  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience with at least one software programming language
  • 5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Experience as a mentor or tech lead, or leading an engineering team
  • Strong software engineering background with full-stack development experience
  • Deep understanding of machine learning fundamentals, particularly large-scale model training
  • Expertise in distributed systems, cloud computing, and scalable data processing
  • Experience with data pipeline design, ETL processes, and data management systems
  • Proficiency in translating academic concepts into production systems

Nice to have

  • 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • Experience with dataset curation and quality assessment techniques
  • Knowledge of computer vision and multimodal data processing
  • Background in research environments or supporting ML research workflows
  • Experience with data visualization and annotation tooling
  • Familiarity with modern data filtering and deduplication methodologies

What the JD emphasized

  • cutting-edge AI robotics research
  • frontier foundation models
  • end-to-end learned systems
  • multimodal perception
  • sophisticated manipulation strategies
  • state-of-the-art data filtering techniques

Other signals

  • building the future of intelligent robotics