Senior/Staff Software Engineer, ML Data Infrastructure

Nuro · Robotics · CA · Offboard Infrastructure

Nuro is seeking a Senior/Staff Software Engineer to build and scale ML data infrastructure for autonomous driving. This role involves designing and developing large-scale data pipelines, storage systems for evaluation metrics, and data annotation tools, with a focus on applied ML techniques to improve data quality and discovery.

What you'd actually do

  1. Design and develop unified, introspectable, large-scale batch and streaming data pipelines that can ingest and process data across a wide range of use cases relevant to evaluation.
  2. Create and implement a storage system capable of accommodating both the large volume and diverse range of evaluation and performance metrics.
  3. Construct intuitive dashboards and reports to present evaluation results, facilitating straightforward comparisons that highlight both improvements and regressions of the ML components and the overall system.
  4. Develop and maintain continuous testing and monitoring systems to guarantee the integrity and resilience of our data and associated data pipelines.
  5. Develop data mining tools with applied ML techniques to support data discovery needs from Autonomy, including Perception, Behavior, and Mapping.

Skills

Required

  • Python
  • Experience working with large-scale data and building scalable & reliable systems/data pipelines
  • Experience setting team or project product and technical vision, timelines, and prioritization
  • Technical Lead experience
  • Mentoring and supporting junior engineers
  • Ability and willingness to deep dive into implementation
  • Driving technical standards and best practices

Nice to have

  • C++
  • GCP, GCS, BigQuery, or PostgreSQL
  • Data engineering, including its tooling and best practices
  • Batch and streaming data processing, warehousing, and analytics solutions
  • Large-scale distributed data systems
  • System & framework design
  • Data workflow orchestration platforms

What the JD emphasized

  • large-scale batch and streaming data pipelines
  • large volume and diverse range of evaluation and performance metrics
  • applied ML techniques
  • Scale data annotation labels with applied state-of-the-art ML techniques

Other signals

  • ML-first approach to autonomous driving
  • Scalable and reliable data infrastructure for training and evaluation data
  • Data mining tools with applied ML techniques
  • Scale data annotation labels with applied state-of-the-art ML techniques