Site Reliability Engineer (SRE)

Comcast · Media · Downingtown, PA +4

Site Reliability Engineer (SRE) role focused on developing and supporting tools and applications for network diagnostics and troubleshooting. Responsibilities include Infrastructure as Code (Terraform) for AWS and Kubernetes, CI/CD pipeline maintenance, monitoring, and on-call support. This is not an AI/ML development role but rather an infrastructure and operations role within a technology company.

What you'd actually do

  1. Providing Infrastructure as Code solutions for a small cohesive group within Comcast
  2. Using Terraform to configure AWS infrastructure and to provision Kubernetes clusters and applications
  3. Working with and supporting developers to help maintain/define best practices
  4. Configuring, watching, tuning and responding to monitoring events
  5. Supporting an on-call rotation with the SRE team

Skills

Required

  • Infrastructure as Code
  • Terraform
  • AWS
  • Kubernetes
  • CI/CD
  • Monitoring systems
  • Git

Nice to have

  • Bash scripting
  • Python scripting
  • troubleshooting applications
  • troubleshooting networking
  • Concourse
  • GoCD
  • distributed systems