Senior Autonomy Data Infrastructure and Analytics Software Engineer

Joby Aviation · Robotics · Santa Cruz, CA · Flight Research

Seeking a Staff Software Engineer to build the data infrastructure and evaluation platform for the autonomy stack of autonomous aircraft. The role owns the end-to-end data lifecycle, from flight logs to safety-critical metrics: designing scalable ingestion pipelines, transforming raw data into reliable datasets and metrics, and developing analysis and evaluation platforms that support development, validation, and operational insight. It also includes building a large-scale evaluation platform for batch jobs, providing self-serve tools for autonomy teams, and investigating anomalies.

What you'd actually do

  1. Design, implement, and maintain a highly scalable ingestion pipeline for a heterogeneous fleet of aircraft, owning the data schemas, APIs, and associated tooling.
  2. Build the analysis platform that transforms raw flight and simulation data into trustworthy metrics, curated datasets, and actionable engineering insight.
  3. Develop the scheduling and execution engine for teams to run massive batch jobs on ingested data (e.g., post-processing, flight replay, metrics, “what-if” scenarios).
  4. Define and operationalize evaluation frameworks that allow teams to measure autonomy performance consistently across aircraft, environments, and software versions.
  5. Develop data augmentation scripts to help catalog the data (metadata, anomaly detection, missing data). This might include novel statistical sampling methods and machine learning solutions to maximize the value of collected data.

Skills

Required

  • 5+ years of experience architecting and operating large-scale distributed systems and data infrastructure
  • Expertise in developing databases, query engines, and storage backends for high-frequency, high-cardinality time series data
  • Experience with one of the “Big Three” cloud providers (AWS, GCP, Azure) and with Infrastructure as Code (IaC) or orchestration tools such as Terraform or Kubernetes
  • Deep understanding of backend, streaming, and batch processing architectures
  • Strong proficiency in Python, C++, and Git within a Linux-based environment
  • Experience processing high-bandwidth sensor data from robotics or autonomous platforms (e.g., GPS, IMU, Lidar, Radar)
  • Proven ability to document complex technical designs, architectural trade-offs, and implementation roadmaps
  • Champion of software best practices, including rigorous code reviews and mentorship
  • Excellent communication skills for collaborating with cross-functional teams
  • US Person

Nice to have

  • Expertise in designing distributed batch processing and workflow orchestration systems for large-scale evaluation (e.g. Airflow, Spark, Ray, or Temporal)
  • Familiarity with Databricks
  • Building dataset management infrastructure
  • Experience with autonomous vehicles
  • Experience in processing aircraft data (GPS, inertial, air data, radio data, etc.)
  • Experience with hardware-in-the-loop (HIL) workflows
  • Expert-level software engineering: deep expertise in architecting and writing clean, scalable, and maintainable code
  • Experience with version control and CI/CD platforms, able to manage your software through its entire lifecycle (development, testing, deployment)
  • Experience deploying ML models in a production environment using modern MLOps principles and tools

What the JD emphasized

  • safety-critical metrics
  • safety-critical behavior
  • US export control compliance requirements
  • US Person

Other signals

  • data infrastructure for autonomy stack
  • evaluation platform for autonomy stack
  • end-to-end data lifecycle
  • safety-critical metrics
  • autonomy evaluation frameworks