Senior Platform Engineer, Runtime (Auth0)

Okta · Enterprise · Toronto, ON · SW Eng - Infrastructure-672

Okta is seeking a Senior Platform Engineer to build and maintain the foundational infrastructure for the Auth0 platform, with a focus on distributed systems, availability, and customer value. The role involves advanced Kubernetes troubleshooting, defining Terraform standards, managing deployment standards with Kustomize, orchestrating GitOps workflows with ArgoCD, architecting a testing automation framework, overseeing control plane infrastructure, and leading service mesh architecture. The ideal candidate has 5+ years of software development experience in infrastructure or platform engineering, proficiency in Golang, a deep understanding of Kubernetes, and DevOps experience with cloud-native technologies.

What you'd actually do

  1. Serve as the ultimate technical authority and final escalation point for advanced Kubernetes cluster troubleshooting, ensuring the continuous stability and resilience of our foundational infrastructure.
  2. Define and enforce organization-wide standards and architectural guidelines for Terraform. You will maintain core provider functionality and oversee critical security protocols, including automated credential rotation machinery.
  3. Develop and maintain application deployment standards, managing Kustomize and its built-ins to standardize and streamline deployments across all product teams.
  4. Orchestrate complex GitOps workflows leveraging ArgoCD and Argo Workflows. You will take full ownership of Environment Lifecycle Management, handling everything from environment creation, synchronization, and user onboarding to failover procedures and safe deletion.
  5. Architect the Testing Automation Framework to support dynamic environment provisioning and destruction. This will enable advanced quality assurance practices, such as chaos testing, while empowering teams to manage their own test workflows.

Skills

Required

  • Golang
  • Kubernetes
  • containerization
  • DevOps
  • cloud-native technologies
  • TCP/IP
  • HTTP
  • AWS
  • Azure
  • Terraform
  • Tier-1 Service mindset

Nice to have

  • PostgreSQL
  • custom Kubernetes controllers
  • Terraform providers
  • gRPC

What the JD emphasized

  • ultimate technical authority
  • final escalation point
  • advanced Kubernetes cluster troubleshooting
  • continuous stability and resilience
  • Define and enforce organization-wide standards
  • architectural guidelines for Terraform
  • critical security protocols
  • automated credential rotation machinery
  • application deployment standards
  • standardize and streamline deployments
  • Orchestrate complex GitOps workflows
  • Environment Lifecycle Management
  • failover procedures
  • safe deletion
  • Testing Automation Framework
  • dynamic environment provisioning and destruction
  • chaos testing
  • Control Plane infrastructure
  • operational stability
  • intra-space network architecture
  • Service Mesh
  • secure and efficient service-to-service communication
  • on-call rotation
  • 5+ years of software development experience
  • infrastructure or platform
  • Golang
  • Kubernetes and containerization
  • DevOps experience
  • cloud-agnostic, cloud-native technologies
  • TCP/IP and HTTP fundamentals
  • managing cloud resources (AWS/Azure) via code
  • automated provisioning
  • Terraform at scale
  • Tier-1 Service mindset
  • handling edge cases, network partitions, and partial failures gracefully
  • deliver work incrementally
  • feedback and iterate over solutions
  • highly reliable, maintainable, scalable and secure
  • ownership, accountability, and attention to detail
  • fully-distributed team
  • relentless passion for ownership, accountability, and diving into complexity
  • highly reliable and performant systems