Senior Site Reliability Engineer

Anduril · Defense · Costa Mesa, CA +1 · AFS : Counter Intrusion Engineering : Software Engineering

Anduril Industries is a defense technology company seeking a Senior Site Reliability Engineer to support its Counter Intrusion and Air Defense teams. The role involves architecting, deploying, and maintaining cloud infrastructure (AWS, Azure, Kubernetes), promoting SRE best practices, and improving operational capabilities through root cause analysis and tooling. It requires experience with Kubernetes, cloud services, infrastructure as code, and CI/CD pipelines, with a strong emphasis on system resilience and performance monitoring.

What you'd actually do

  1. Architect, deploy, and maintain infrastructure with cloud providers and Kubernetes (EKS)
  2. Collaborate with multidisciplinary teams to define and execute internal and external deployments
  3. Promote SRE best practices in system resilience, performance monitoring, and high availability
  4. Design, develop, and deliver solutions using infrastructure as code with tools like Terraform and Python
  5. Develop and maintain CI/CD pipelines for automated deployment

Skills

Required

  • Kubernetes ecosystem (Docker, Helm, ArgoCD, Terraform)
  • cloud services (AWS/Azure)
  • Go, Python, Rust, or C++
  • data-driven root cause analysis on complex systems
  • ability to train peers or customers on the operation of a product
  • Computer Science degree or equivalent

Nice to have

  • managing Kubernetes clusters of hundreds of nodes
  • performance improvement techniques, metrics, and alerting
  • KubeVirt, qemu, virtualization and hypervisor technologies
  • low-level frameworks, Linux and databases
  • excellent written and verbal communication skills

What the JD emphasized

  • Active U.S. Secret security clearance
  • 6+ years of engineering experience