Senior Software Engineer - SRE

Caterpillar · Industrial · Bangalore, Karnataka +1

Site Reliability Engineer responsible for ensuring the reliability, availability, and performance of mission-critical, eCommerce, and platform systems and infrastructure. Collaborates with cross-functional teams to improve system stability, automate tasks, and enhance service delivery. The role involves monitoring and troubleshooting systems, designing automated processes, implementing monitoring tools, working with developers on performance, and ensuring compliance with security and regulatory requirements.

What you'd actually do

  1. Monitor and troubleshoot production and QA systems to identify and resolve performance, scalability, and reliability issues proactively.
  2. Participate in the on-call rotation to provide 24/7 critical incident support for supported systems.
  3. Design, implement, and maintain automated processes and tools to streamline deployment and release processes.
  4. Collaborate with cross-functional teams to define, document, and implement operational processes, best practices, and procedures.
  5. Implement and maintain system monitoring tools and dashboards to provide real-time insights into system performance and identify potential issues.

Skills

Required

  • site reliability engineering
  • DevOps
  • QA
  • Node.js/Next.js
  • AWS
  • CloudFormation
  • Terraform
  • GitHub
  • Azure DevOps
  • Python
  • JavaScript
  • networking
  • load balancing
  • on-premises hosting solutions
  • web application architectures
  • Docker
  • Kubernetes

Nice to have

  • Python (preferred)
  • JavaScript (preferred)