Staff Software Engineer, Continuous Learning

Aurora Innovation · Robotics · Mountain View, CA · Software Autonomy Sensing

Staff Software Engineer on the Autonomy Data: Continuous Learning team at Aurora, focusing on improving dataset quality and models using foundation models and RLHF techniques. The role involves owning model training and inference pipelines, establishing semi-automated evaluation mechanisms, and expanding foundation model approaches for sourcing events. Requires Python, C++, experience with cloud environments, and knowledge of computer vision, LLMs, or deep learning.

What you'd actually do

  1. Improve our dataset quality by establishing semi-automated evaluation mechanisms leveraging state-of-the-art models as well as RLHF techniques
  2. Expand our foundation model approach for sourcing interesting events to millions of miles
  3. Own model training and inference pipelines for all core Autonomy models
  4. Collaborate across teams and functions (product, program, operations, data science) to drive projects from inception to delivery

Skills

Required

  • BS in Computer Science, or a related field
  • Excellent Python and proficient C++ programming and software design skills
  • Experience with storage and database management systems (e.g., one of SQL, NoSQL, Protobuf, Parquet, HDFS)
  • Experience working in a cloud environment (e.g., AWS, GCP, Azure)
  • Knowledge and experience in at least one of computer vision, LLMs, or deep learning for other applications

Nice to have

  • Excellent C++ programming and software design skills
  • Distributed system design patterns (high availability, scaling, load balancing, caching, sharding, etc.)
  • PyTorch and GPU programming experience

What the JD emphasized

  • state-of-the-art foundation models
  • RLHF techniques
  • model training and inference pipelines
  • computer vision, LLMs, or deep learning

Other signals

  • improving models with high-quality data
  • leveraging state-of-the-art foundation models
  • RLHF techniques
  • model training and inference pipelines