Senior Site Reliability Engineer

Workday · Enterprise · USA.VA.Reston

Senior Site Reliability Engineer responsible for operating, monitoring, automating, and maintaining a Kubernetes-based platform, ensuring high availability, scalability, and security for Workday's AI platform. The role involves infrastructure automation, CI/CD pipelines, incident handling, and observability.

What you'd actually do

  1. Keep the Workday Kubernetes-based platform maintained, healthy, and highly available for our customers through infrastructure automation, CI/CD pipelines, reporting, incident handling and response, and observability tools.
  2. Maintain the overall platform: maintain core platform components, ensuring high availability, scalability, and security.
  3. Automate and optimize: Automate infrastructure provisioning, configuration management, and application deployments using tools like Terraform and Argo CD.
  4. Troubleshooting and support: Provide support for platform-related issues, working closely with development teams to resolve them.
  5. Security and compliance: Implement and maintain standard security methodologies for the platform, ensuring compliance with industry standards.

Skills

Required

  • 5 years of hands-on experience working with large-scale cloud infrastructure, automation, and overall DevOps methodologies
  • Bachelor's degree in a computer-related field or equivalent work experience
  • Infrastructure as code: Proficiency in infrastructure automation tools like Terraform.
  • CI/CD: Experience with building, maintaining, and consuming CI/CD pipelines and tools like Argo CD.
  • Problem-solving: Strong analytical and problem-solving skills.
  • Communication: Excellent communication and collaboration skills.

Nice to have

  • Strong understanding of Kubernetes
  • Amazon Web Services proficiency working in a production environment
  • Proficiency in at least one programming language such as C#, Python, Ruby, Rust, or Go
  • Experience with security auditing and compliance frameworks.
  • Experience working in air-gapped cloud regions

What the JD emphasized

  • security clearance at the TS/SCI w/CI Poly level
  • ability to obtain and maintain a U.S. government issued security clearance
  • active TS/SCI w/CI Poly is preferred