Software Development Engineer, Data Platform, Fauna

Amazon · Big Tech · NY +1 · Software Development

Software Development Engineer to build foundational data systems for robotics and machine learning development, focusing on infrastructure for data collection, storage, processing, and transformation from robots. The role involves owning services, APIs, and distributed systems, partnering with scientists and engineers, and developing systems for ML training data preparation.

What you'd actually do

  1. Design, build, and operate scalable services and distributed systems for ingesting, storing, and serving large volumes of multimodal robotics data
  2. Own components end-to-end: design, implementation, testing, deployment, monitoring, and on-call
  3. Build well-designed APIs and tooling that let researchers and engineers discover, query, and process large datasets efficiently
  4. Develop real-time and batch processing systems for preparing data for ML training
  5. Partner with science and engineering teams to translate evolving requirements into reusable platform capabilities

Skills

Required

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience designing or architecting new and existing systems (design patterns, reliability, and scaling)
  • Bachelor's degree or foreign equivalent in Computer Science, Engineering, Mathematics, or a related field
  • Experience with programming languages such as Python, Java, or C++
  • Experience building and operating distributed services or APIs in production
  • Experience with large-scale data storage technologies (object stores, columnar formats, time-series or relational databases)

Nice to have

  • Track record of building internal data platforms or tooling that accelerate ML for science and engineering teams
  • Experience partnering directly with researchers or scientists to turn experimental workflows into durable platform capabilities
  • Familiarity with ML workflows and how data systems support model training
  • Experience with robotics, IoT, or other high-volume sensor data (time-series, video, point clouds)
  • Experience with cloud data platforms (AWS, GCP) and hybrid on-prem/cloud architectures
  • Proficiency with SQL and query optimization for large datasets

What the JD emphasized

  • partner directly with applied scientists and robotics engineers
  • data systems that power our robotics and machine learning development
  • preparing data for ML training

Other signals

  • data platform for robotics and ML
  • collecting, storing, processing, and transforming vast amounts of data generated by robots
  • design and implement infrastructure
  • develop real-time and batch processing systems for preparing data for ML training