Site Reliability Engineer

Visa · Fintech · Bengaluru, India

Visa is seeking a Site Reliability Engineer (Data Reliability Engineer) to design, build, and evolve cloud-native, containerized infrastructure for its data products and services. The role focuses on advancing platform maturity and ensuring the availability, security, scalability, and reliability of the data ecosystem. It calls for deep expertise in systems design, cloud infrastructure, networks, databases, and modern data technologies, along with hands-on experience in infrastructure automation and high-scale distributed systems. Key qualifications include experience with Infrastructure as Code (Terraform), Kubernetes, CI/CD, reliability engineering concepts, database technologies, observability tools, and automation scripting.

What you'd actually do

  1. Designing, building, and evolving cloud‑native, containerized infrastructure that powers our data products and services.
  2. Advancing our platform maturity by supporting cross-functional squads, leading complex technical initiatives, and ensuring the availability, security, scalability, and reliability of our data ecosystem.
  3. Bringing deep expertise in systems design, cloud infrastructure, networks, databases, and modern data technologies.
  4. Contributing hands-on experience with complex technology adoption, infrastructure automation, and high-scale distributed systems.
  5. Contributing to cross-functional technical initiatives from design through production under guidance.

Skills

Required

  • Bachelor's degree in Computer Science, Engineering, or a related field OR 3+ years of relevant work experience
  • Hands-on experience designing and operating cloud‑native infrastructure
  • Knowledge of Infrastructure as Code (Terraform)
  • Good understanding of Kubernetes and container orchestration concepts
  • Familiarity with CI/CD systems, pipeline configuration, automation, and secure deployment practices
  • Foundational knowledge of reliability engineering concepts (SLOs, error budgets, incident response)
  • Basic understanding of database technologies including SQL, NoSQL, and common data storage patterns
  • Experience using observability tools and stacks (Prometheus, Grafana, OpenTelemetry, ELK/EFK, Datadog, or similar)
  • Basic automation experience using Bash, Python, or configuration management tools such as Ansible
  • Working knowledge of software engineering practices including version control, testing, code reviews, and common design patterns
  • Ability to contribute to cross-functional technical initiatives from design through production under guidance
  • Experience supporting technology adoption and platform improvements across teams
  • Capability to follow and help implement infrastructure standards, best practices, and architectural guidelines
  • Comfortable working in partially ambiguous situations, escalating risks appropriately, and learning to make sound technical trade-offs
  • Strong problem-solving skills with demonstrated ability to reduce toil, address technical debt, and improve system stability
  • Experience participating in on-call rotations, incident response, and post-incident reviews
  • Clear written and verbal English communication skills
  • Ability to collaborate effectively with data engineers, platform engineers, SREs, security teams, and product teams
  • Ability to produce clear technical documentation and contribute to architectural discussions and decision records