Technical Support Manager - Vehicle Reliability Engineering

Aurora Innovation · Robotics · DFW1 · Software Platform Software & Services

This role manages a Vehicle Reliability Engineering (VRE) team focused on the systemic health and operational readiness of Aurora's autonomous fleet. It involves proactive incident prevention, root cause analysis, diagnostic automation, systems monitoring, stakeholder communication, and process development, with a focus on SRE-based support and incident response.

What you'd actually do

  1. Manage, mentor, and scale a team of operations engineers and reliability specialists focused on the systemic health and operational readiness of the Aurora autonomous fleet.
  2. Shift the team’s focus from reactive troubleshooting to proactive incident prevention by driving deep Root Cause Analysis (RCA) and implementing long-term hardware/software fixes.
  3. Partner with engineering teams to build and deploy automated diagnostic tools, scripts, and alerting systems that reduce manual intervention and improve vehicle uptime.
  4. Oversee the "nerve center" of fleet health, utilizing telemetry, Linux command-line tools, and data dashboards to predict and resolve sensor (Lidar/Radar) and compute failures before they impact field operations.
  5. Act as the primary translation layer between Operations, Product, and Infrastructure engineering, reporting to senior management and keeping stakeholders aligned on reliability initiatives.

Skills

Required

  • 5+ years of experience managing Site Reliability Engineering (SRE), Technical Operations, or Sustaining Engineering teams
  • 3+ years of direct people management experience
  • Proficiency in deep technical areas such as Linux environments, IT systems, hardware/software integration, networking, and sensor suites (Lidar/Radar)
  • Proven experience establishing incident response frameworks, automation protocols, and performance metrics
  • Excellent communication skills
  • Strong bias for action and the ability to make high-pressure decisions

Nice to have

  • Deep technical knowledge of hardware/software integration, including experience troubleshooting sensors (Lidar/Radar) and industrial computers.
  • Expert-level Linux skills: 5+ years of experience in Linux administration, command-line troubleshooting, and shell scripting.
  • Automation Mindset: Previous experience with Python, Bash, or Go for automating operational tasks and support workflows.
  • Experience in the Autonomous Vehicle (AV), robotics, or aerospace industries.
  • Knowledge of computer networking (TCP/IP, UDP, VLANs) and data log analysis.
  • Experience troubleshooting and diagnosing escalated software and hardware issues involving Linux environments, Lidar, Radar, and on-vehicle compute systems.

What the JD emphasized

  • proactive incident prevention
  • automated diagnostic tools
  • sensor (Lidar/Radar) and compute failures
  • SRE-based support processes
  • Linux environments
  • sensor troubleshooting (Lidar/Radar)