SRE/DevOps Engineer (Hybrid, Sunnyvale)

CrowdStrike · Enterprise · Sunnyvale, CA

SRE/DevOps Engineer role focused on building and maintaining production infrastructure, CI/CD pipelines, control planes, and observability for a cybersecurity platform. The role emphasizes automation, reliability, security, and cost management across multiple cloud providers and regions, and mentions using AI-driven automation for operations.

What you'd actually do

  1. Run production infrastructure - Deploy, upgrade, and maintain platform services across multiple clouds and regions on Kubernetes.
  2. Build and maintain CI/CD pipelines - Make it safe and fast to ship infrastructure changes using GitOps workflows and release automation.
  3. Build control planes - Create the APIs and tooling that make provisioning and scaling repeatable and self-service.
  4. Own capacity planning - Track usage, forecast growth, right-size clusters, and keep infrastructure costs in check.
  5. Build observability - Set up metrics, dashboards, and alerts using Prometheus and Grafana. Write runbooks that make on-call clear and actionable.

Skills

Required

  • Kubernetes
  • CI/CD
  • Infrastructure-as-code
  • GitOps
  • Observability
  • Prometheus
  • Grafana
  • Distributed systems
  • Cloud infrastructure
  • Security best practices
  • Automation

Nice to have

  • Security platforms
  • Telemetry pipelines
  • Internal developer platforms
  • Self-service tooling
  • Service mesh (Istio, Linkerd)
  • Workflow orchestration (Temporal, Argo Workflows)
  • Distributed tracing
  • Disaster recovery automation
  • Cybersecurity SaaS
  • Go programming

What the JD emphasized

  • 8+ years in DevOps, SRE, or platform engineering.
  • Hands-on experience running stateful distributed systems on Kubernetes in production.
  • CI/CD experience - Building and owning pipelines using GitHub Actions, Jenkins, Tekton, or similar tools.
  • Infrastructure-as-code skills - Terraform, Pulumi, or Crossplane, with no manual configuration.
  • GitOps experience - ArgoCD or Flux for managing infrastructure deployments.
  • Observability skills - Prometheus, Grafana, and distributed tracing tools like Jaeger or OpenTelemetry.
  • Security mindset - You implement auth, encryption, secret management, and network policies as part of normal work.
  • Multi-cloud or multi-region experience - You have managed infrastructure across providers or regions.