Software Engineer L5 - Edge Platform Automation & Validation

Netflix · Big Tech · United States · Remote · Engineering

Software Engineer L5 to own and evolve automated CI/CD and validation infrastructure for Netflix's Open Connect edge appliances, ensuring security, reliability, and performance. The role involves improving failure triage with AI-assisted tooling and leading engineers.

What you'd actually do

  1. Own and build scalable testing infrastructure and end-to-end automated validation for edge appliances, covering functional, resiliency, performance, and upgrade-and-rollback testing. The infrastructure should provide high reliability, strong observability, and clear release gates, and include tests that validate the platform can meet scaling and performance requirements under production-like workloads.
  2. Improve failure triage with AI-assisted tooling that reduces time-to-detection and time-to-resolution.
  3. Lead and mentor engineers building and maintaining test automation and release qualification.
  4. Partner with OS, security, hardware, and application teams to ensure validation keeps pace with rapid product development.
  5. Debug complex regressions across hardware/firmware/OS boundaries and collaborate cross-functionally to drive fixes to resolution.

Skills

Required

  • Python
  • Rust
  • Go
  • Shell scripting
  • Linux
  • FreeBSD
  • CI/CD
  • Test automation
  • Cloud services
  • Distributed systems
  • Debugging
  • Technical leadership

Nice to have

  • AI tools for operational triage
  • Log clustering
  • Anomaly detection
  • Guardrails
  • Fallback paths
  • Auditability
  • Performance tooling
  • perf
  • flamegraphs
  • bpftrace/eBPF
  • DTrace
  • fio
  • Network benchmarking
  • Open-source collaboration
  • Hardware lab automation
  • Fleet provisioning
  • PXE boot
  • Imaging
  • Remote power control
  • Serial console access
  • Rack automation
  • Incident response
  • Postmortems
  • Root cause analysis
  • Preventative engineering
  • BIOS validation
  • Firmware validation
  • Firmware rollouts
  • Hardware vendor collaboration

What the JD emphasized

  • 10+ years software engineering experience (or equivalent depth), including ownership of CI/CD systems and architecting large-scale test automation.
  • Strong coding ability in Python, Rust, and/or Go, with comfort writing shell scripts.
  • Deep hands-on experience with Linux and/or FreeBSD in systems contexts (boot, networking and storage).
  • Strong ability to design, build, and operate cloud services that support CI/CD and test automation, including maintaining service reliability, scalability, observability, and cost efficiency.
  • Experience designing automated test frameworks for reliability, performance, hardware-in-the-loop, and integration testing.
  • Proven ability to provide technical leadership across teams through setting standards, mentoring, and owning roadmaps.
  • Experience with modern CI systems and build and release pipelines such as GitHub Actions, Jenkins or similar tools.
  • Strong debugging skills across distributed systems and low-level systems boundaries using logs, metrics, tracing, and performance tooling.
  • Proficiency working on highly distributed systems.