About Us

Visa is a world leader in payments technology, facilitating transactions between consumers, merchants, financial institutions and government entities across more than 200 countries and territories, dedicated to uplifting everyone, everywhere by being the best way to pay and be paid.

At Visa, you'll have the opportunity to create impact at scale — tackling meaningful challenges, growing your skills and seeing your contributions impact lives around the world.

Join Visa and do work that matters – to you, to your community, and to the world. Progress starts with you.

Job Description

Join Pismo’s Platform squad within the SRE Tribe, dedicated to owning and evolving the containerized platform that underpins critical workloads. You’ll work cross‑functionally to ensure our platform is reliable, scalable, secure, and easy to operate, focusing on Kubernetes at scale and cloud architecture.

What You’ll Do

Own the end‑to‑end lifecycle (design, provisioning, upgrades, maintenance, and decommissioning) of core platform components, including:

Cloud infrastructure primitives
Kubernetes clusters and cluster services
Networking, ingress, and service discovery
Service Mesh and supporting data‑plane components

Design platform components to be resilient by default, applying SRE principles such as:

Fault isolation and graceful degradation
Capacity planning and saturation control
Reduced operational toil and clear failure modes

Lead the design and implementation of infrastructure bootstrap orchestration, including:

Automated cluster and environment provisioning
Deterministic, repeatable platform bring‑up and teardown
Dependency‑aware orchestration across cloud, network, and Kubernetes layers

Drive Infrastructure‑as‑Code and GitOps‑first practices to ensure:

Platform components are reproducible and auditable
Changes are automated, testable, and reversible
Manual intervention is minimized or eliminated

Identify automation gaps and lead initiatives that reduce human effort, onboarding time, and operational risk.

Apply and promote SRE operational excellence practices, including:

Clear ownership and runbooks for platform components
Participation in on‑call rotation as a platform reliability escalation point
Incident response, post‑incident reviews, and problem management

Improve day‑2 operations by standardizing upgrade/rollback strategies and reducing MTTD/MTTR.
Ensure platform operations align with security, compliance, and internal control requirements.
Collaborate with engineering teams across the organization to influence platform adoption, reliability standards, and cloud‑native best practices.

This is a remote position. A remote position does not require job duties be performed within proximity of a Visa office location. Remote positions may be required to be present at a Visa office with scheduled notice.

Qualifications

For this role, you must be based in Brazil.

Language Skills

Proficiency in English at B2 level or above (Upper-Intermediate).

Technical Skills

Strong hands‑on experience with public cloud platforms (AWS preferred, Azure also considered).
Proven experience operating and administering Kubernetes at scale in production environments.
Strong experience with container orchestration platforms and cloud architecture fundamentals (networking, IAM/security concepts, and reliability patterns).
Experience with Infrastructure as Code (Terraform preferred) and automation‑first workflows.
Familiarity with GitOps practices and CI/CD pipelines.
Strong troubleshooting skills for distributed systems, including root‑cause analysis and reliability improvements.
Experience with observability concepts and practices (monitoring, logging, alerting, tracing).

Preferred Qualifications

Experience with Service Mesh technologies (Istio preferred; App Mesh or Linkerd also considered).
Experience working with critical or mission‑critical systems.
Strong background applying SRE principles (operational readiness, incident management, runbooks, toil reduction).
AWS certifications.

Visa is an EEO Employer

Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.

Sr Site Reliability Engineer
