SRE Operations Engineer

Okta · Enterprise · Bangalore, India · Tech Ops-610

The SRE Operations Engineer at Okta keeps the Customer Identity Cloud running smoothly and available. The role involves maintaining production systems, triaging requests, monitoring platform health, and escalating issues, with potential growth into Site Reliability Engineering.

What you'd actually do

  1. Monitor platform health and take steps to alleviate issues related to deployment and operations
  2. Execute operational work, including updating/patching and maintaining the Engineering Service Desk queue
  3. Ensure team requests are triaged and/or actioned in a timely manner
  4. Serve as the escalation point for platform issues from customer support teams
  5. Execute runbooks and update processes as required

Skills

Required

  • 1+ years in a Cloud Operations role
  • 1+ years in a production environment supporting large-scale, mission-critical applications
  • General platform infrastructure knowledge, including high availability / load balancing concepts, routers, firewalls and storage subsystems
  • Sound understanding of protocols/technologies like HTTP, SSL, SSH and Kubernetes
  • Familiarity with a variety of open source technologies and tools such as MongoDB and Node.js
  • Experience with monitoring and troubleshooting techniques
  • Ability to communicate clearly with a diverse range of stakeholders across multiple domains
  • Multitasking and time management skills

Nice to have

  • Knowledge of Terraform
  • Familiarity with cloud platforms such as AWS and Azure
  • Linux fundamentals and knowledge of tools like Datadog
  • Interest in and/or an understanding of programming (e.g., Go, shell scripting)

What the JD emphasized

  • production systems remain operational at all times
  • customer availability expectations are exceeded