What you'd actually do

Design, build, and operate production-grade machine learning systems that run at Visa’s global scale for NLP and related workloads with strict latency and throughput targets (e.g. 50k-100k+ tokens/sec @ 100+ RPS).

Develop end-to-end ML pipelines covering data preparation, model training, validation, deployment, monitoring, and retraining.

Build and maintain high-availability, fault-tolerant ML services and APIs, including load balancing and robust autoscaling for GPU inference.

Design and implement advanced agentic AI systems: RAG pipelines, multi-step and branching agents, actor–critic control loops, validation/guardrail stages, and custom tools.

Work closely with product, data, and platform teams to turn requirements into concrete ML system designs and production deployments across multiple Visa technology offerings.

Skills

Required

Foundational Python programming skills
Experience building and operating ML pipelines and models in production
Hands-on with PyTorch
GPU inference and optimization
Kubernetes and Docker for deploying and operating ML services
CI/CD for ML services and pipelines
Infrastructure as code with Terraform
Experience with agentic AI frameworks and patterns
Experience with Kubeflow Pipelines (KFP) or similar systems for model training workflows
Experience with at least one major cloud platform for ML (AWS, GCP, or Azure)

Nice to have

TensorFlow experience
Experience with Triton Inference Server or a similar serving stack
Exposure to one or more system/server programming languages is a plus (e.g., C++, Go, Rust, or Java)
Curiosity and passion for machine learning and data‑driven systems.
Comfort challenging existing solutions and learning new tools, frameworks, and platforms.
Interest in areas such as MLOps, model monitoring, feature engineering, and responsible AI.

About Us

Visa is a world leader in payments technology, facilitating transactions between consumers, merchants, financial institutions and government entities across more than 200 countries and territories, dedicated to uplifting everyone, everywhere by being the best way to pay and be paid.

At Visa, you'll have the opportunity to create impact at scale — tackling meaningful challenges, growing your skills and seeing your contributions impact lives around the world.

Join Visa and do work that matters – to you, to your community, and to the world. Progress starts with you.

Job Description

The Opportunity

We are looking for talented, curious, and impact‑driven Machine Learning Engineers who enjoy solving complex problems using a combination of software engineering, data engineering, and applied machine learning.

As part of a cross‑functional product team, you will design, build, deploy, and operate ML solutions that directly support Visa’s core payment platforms and value‑added services. Your work will move beyond experimentation into real‑world production systems that must meet strict requirements for reliability, performance, security, and compliance.

The Work Itself

Design, build, and operate production‑grade machine learning systems that run at Visa’s global scale for NLP and related workloads with strict latency and throughput targets (e.g. 50k-100k+ tokens/sec @ 100+ RPS).
Develop end‑to‑end ML pipelines covering data preparation, model training, validation, deployment, monitoring, and retraining.
Build and maintain high-availability, fault-tolerant ML services and APIs, including load balancing and robust autoscaling for GPU inference.
Design and implement advanced agentic AI systems: RAG pipelines, multi-step and branching agents, actor–critic control loops, validation/guardrail stages, and custom tools.
Work closely with product, data, and platform teams to turn requirements into concrete ML system designs and production deployments across multiple Visa technology offerings.
Continuously improve model quality, data quality, system reliability, and cost/performance of the ML stack.

Essential Functions:

Own ML model and service implementations end to end, from prototype to production.
Apply MLOps practices for safe, repeatable deployment, monitoring, and lifecycle management of models and agents.
Engineer scalable APIs and serving layers that integrate cleanly with existing systems and downstream applications.
Use solid data structures, algorithms, and time/memory complexity analysis to make sound, scale-aware design choices.
Participate in technical design reviews and architecture discussions, contributing an ML and systems perspective.
Debug and optimize CPU/GPU inference, data pipelines, and distributed workloads in collaboration with other engineers.

This is a hybrid position. Expectation of days in the office will be confirmed by your Hiring Manager.

Qualifications

Basic Qualifications:

Bachelor's degree, OR relevant work experience

Preferred Qualifications:

Bachelor's degree, OR 3+ years of relevant work experience
Bachelor's degree in Computer Science, Engineering, Data Science, or a related technical field, or equivalent practical experience.
Some practical experience as a Machine Learning Engineer, Software Engineer (ML‑focused), or Data Engineer with ML exposure.

Technical Expertise

Foundational Python programming skills, with some experience writing and maintaining production or production-adjacent code. Exposure to one or more system/server programming languages is a plus (e.g., C++, Go, Rust, or Java).
Experience building and operating ML pipelines and models in production, preferably in NLP-focused problem spaces.
Hands-on with PyTorch (Transformers, NN/MLP architectures); TensorFlow experience is a plus.
GPU inference and optimization (CUDA, ONNX); experience with Triton Inference Server or a similar serving stack is a strong plus.
Kubernetes and Docker for deploying and operating ML services (namespaces, resource limits, rolling deploys).
CI/CD for ML services and pipelines.
Infrastructure as code with Terraform, including IAM and core data/compute resources.
Experience with agentic AI frameworks and patterns (e.g., Google ADK, custom toolchains, RAG orchestration).
Experience with Kubeflow Pipelines (KFP) or similar systems for model training workflows.
Experience with at least one major cloud platform for ML (AWS, GCP, or Azure), e.g., GCP Vertex AI or AWS SageMaker.
Curiosity and passion for machine learning and data‑driven systems.
Comfort challenging existing solutions and learning new tools, frameworks, and platforms.
Interest in areas such as MLOps, model monitoring, feature engineering, and responsible AI.

Problem Solving & Collaboration

Ability to approach complex problems logically and creatively.
Strong collaboration skills and willingness to learn from peers and mentors in cross‑functional teams.
Clear communication of technical concepts to both technical and non‑technical stakeholders.

Adaptability & Learning

Openness to feedback and changing priorities.

We don’t expect you to have experience with every tool or technique listed. Instead, we look for engineers with strong fundamentals, curiosity, and the ability to grow into building and owning production‑grade machine learning systems at scale.

Visa is an EEO Employer

Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.