Production Support Engineering LMTS

Salesforce · Enterprise · San Francisco, CA +2

Salesforce is seeking a Production Support Engineer (LMTS) to join their embedded reliability team for the Agentforce for Supply Chain platform. This role focuses on production excellence, performance tuning, and infrastructure automation to scale the platform for global demand and enterprise-grade resilience. The engineer will partner with PMTS-level engineers, contribute to infrastructure strategy, maintain automated environments, support AI/ML infrastructure, harden the observability stack, optimize performance, and leverage AI tools for operational tasks. The role requires strong experience in SRE/Production Engineering, Kubernetes, Terraform, cloud platforms, and coding in Golang, TypeScript, or Python, with a deep understanding of distributed systems and AI agents. Advanced prompt engineering skills and an AI-first approach to engineering are essential.

What you'd actually do

Own the reliability roadmap for major product areas, working to transition our systems from startup-speed architectures to highly-available, global-scale enterprise solutions.
Partner with PMTS-level engineers to refine our infrastructure strategy, contributing senior-level perspectives on system design, capacity planning, and bottleneck identification.
Maintain and evolve our automated environments, focusing on making our "infrastructure-as-plugins" model more robust and developer-friendly.
Support the scaling of our AI/ML infrastructure, ensuring our models have the GPU resources and data pipelines required to deliver real-time supply chain insights.
Lead the "1 to 100" hardening of our observability stack. You won’t just respond to incidents; you’ll build the tooling that prevents them and the telemetry that explains them.

Skills

Required

5+ years of experience in SRE, Production Engineering, or Backend Engineering with a heavy focus on operations and scale
Proven Scaling Experience: You have previously helped take a product through a high-growth phase (the "1 to 100" journey), dealing with the technical debt and architectural shifts that come with it.
Technical Breadth: Strong proficiency in Kubernetes, Terraform/OpenTofu, and AWS/GCP/Azure.
Coding Mastery: Ability to write and review production-level code in Golang, TypeScript, or Python—you view automation as a software engineering problem.
Systems Expert: Deep understanding of distributed systems, including how to debug complex interactions between microservices, databases, and AI agents.
Low-Ego Collaboration: Experience working within a senior team of Principal engineers, capable of both leading specific initiatives and supporting the broader group’s technical vision.
A demonstrated, genuine AI-first approach to engineering: using AI to move faster, build fluency across the stack, and contribute well beyond your core specialty.
Experience using AI tools (e.g., Claude Code, GitHub Copilot, Codex, Cursor, etc.) in development workflows.
Advanced prompt engineering skills and the ability to write precise, structured prompts and cultivate the system context that makes AI outputs reliable, secure, and production-ready.

Nice to have

M.S. in Computer Science or equivalent practical experience.
Database Specialist: Strong experience with PostgreSQL at scale (partitioning, indexing, query tuning).
Distributed Systems: Advanced knowledge of microservice orchestration and durability patterns, including hands-on experience with Temporal for workflow reliability and service mesh for secure, observable service-to-service communication in high-growth SaaS environments.
Supply Chain/Logistics: Experience with the unique data constraints and reliability requirements of manufacturing or global logistics.
Salesforce Knowledge: Familiarity with Salesforce infrastructure, Hyperforce, or Data Cloud is a plus.

Other signals

Scaling architecture to handle global demand
Hardening systems for enterprise-grade resilience
Integrating deeply with the Agentforce ecosystem
Production excellence, performance tuning, and infrastructure automation
Scaling our AI/ML infrastructure
GPU resources and data pipelines required to deliver real-time supply chain insights
Hardening of our observability stack
Building the tooling that prevents incidents and the telemetry that explains them
Deep-dive into SQL optimization, API latency, and cross-service communication
Data-intensive supply chain platform remains performant under heavy load
Using AI tools to automate routine operational tasks and accelerate infrastructure delivery
Building and maintaining the shared system context
Critically evaluate code (human or AI-generated) for correctness, quality, security, and performance

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category

Software Engineering

Job Details

About Salesforce

Salesforce is the #1 AI CRM, where humans with agents drive customer success together. Here, ambition meets action. Tech meets trust. And innovation isn’t a buzzword — it’s a way of life. The world of work as we know it is changing and we're looking for Trailblazers who are passionate about bettering business and the world through AI, driving innovation, and keeping Salesforce's core values at the heart of it all.

Ready to level-up your career at the company leading workforce transformation in the agentic era? You’re in the right place! Agentforce is the future of AI, and you are the future of Salesforce.

Opportunity & Product

Join an agile team with deep startup roots. We operate as a high-velocity ‘startup-within-Salesforce,’ following our recent acquisition. You’ll be managed by the same founders and engineers who built the original company, offering the autonomy of a small team backed by the global scale and trust of Salesforce.

We have successfully moved past the "0 to 1" phase. We have a product that works, customers who love it, and the backing of Salesforce. Now, we are entering the "1 to 100" phase: scaling our architecture to handle global demand, hardening our systems for enterprise-grade resilience, and integrating deeply with the Agentforce ecosystem. This is your chance to help lead that transition.

What You’ll Do

As a Production Support Engineer (LMTS), you will be a senior technical lead within our embedded reliability team. You aren’t building the foundation alone—you’ll work alongside a group of engineers and product owners to ensure the Agentforce for Supply Chain platform is the most reliable AI-powered engine in the industry.

This is a role for an engineer who loves the "scaling" problem. You will focus on production excellence, performance tuning, and infrastructure automation. Because you are embedded in the product organization, you’ll have a seat at the table during design reviews, ensuring that as we add new agentic capabilities, they are built to scale from day one.

Responsibilities

Scaling & Reliability: Own the reliability roadmap for major product areas, working to transition our systems from startup-speed architectures to highly-available, global-scale enterprise solutions.
Collaborative Leadership: Partner with PMTS-level engineers to refine our infrastructure strategy, contributing senior-level perspectives on system design, capacity planning, and bottleneck identification.
Infrastructure as Code: Maintain and evolve our automated environments, focusing on making our "infrastructure-as-plugins" model more robust and developer-friendly.
AI Operations (AIOps): Support the scaling of our AI/ML infrastructure, ensuring our models have the GPU resources and data pipelines required to deliver real-time supply chain insights.
Production Excellence: Lead the "1 to 100" hardening of our observability stack. You won’t just respond to incidents; you’ll build the tooling that prevents them and the telemetry that explains them.
Performance Engineering: Deep-dive into SQL optimization, API latency, and cross-service communication to ensure our data-intensive supply chain platform remains performant under heavy load.
AI-First Workflow: Lean into the future of engineering by using AI tools (Claude Code, etc.) to automate routine operational tasks and accelerate infrastructure delivery.
Contribute to building and maintaining the shared system context, an explicit repository of system designs, constraints, and standards that enables AI to operate accurately and reliably.
Critically evaluate code (human or AI-generated) for correctness, quality, security, and performance.

Required Qualifications

5+ years of experience in SRE, Production Engineering, or Backend Engineering with a heavy focus on operations and scale.
Proven Scaling Experience: You have previously helped take a product through a high-growth phase (the "1 to 100" journey), dealing with the technical debt and architectural shifts that come with it.
Technical Breadth: Strong proficiency in Kubernetes, Terraform/OpenTofu, and AWS/GCP/Azure.
Coding Mastery: Ability to write and review production-level code in Golang, TypeScript, or Python—you view automation as a software engineering problem.
Systems Expert: Deep understanding of distributed systems, including how to debug complex interactions between microservices, databases, and AI agents.
Low-Ego Collaboration: Experience working within a senior team of Principal engineers, capable of both leading specific initiatives and supporting the broader group’s technical vision.
A demonstrated, genuine AI-first approach to engineering: using AI to move faster, build fluency across the stack, and contribute well beyond your core specialty.
Experience using AI tools (e.g., Claude Code, GitHub Copilot, Codex, Cursor, etc.) in development workflows.
Advanced prompt engineering skills and the ability to write precise, structured prompts and cultivate the system context that makes AI outputs reliable, secure, and production-ready.

Preferred Qualifications

M.S. in Computer Science or equivalent practical experience.
Database Specialist: Strong experience with PostgreSQL at scale (partitioning, indexing, query tuning).
Distributed Systems: Advanced knowledge of microservice orchestration and durability patterns, including hands-on experience with Temporal for workflow reliability and service mesh for secure, observable service-to-service communication in high-growth SaaS environments.
Supply Chain/Logistics: Experience with the unique data constraints and reliability requirements of manufacturing or global logistics.
Salesforce Knowledge: Familiarity with Salesforce infrastructure, Hyperforce, or Data Cloud is a plus.
Public Cloud Expertise: Deep knowledge of networking, security, and identity management within major cloud providers.

Unleash Your Potential

When you join Salesforce, you’ll be limitless in all areas of your life. Our benefits and resources support you to find balance and be your best, and our AI agents accelerate your impact so you can do your best. Together, we’ll bring the power of Agentforce to organizations of all sizes and deliver amazing experiences that customers love. Apply today to not only shape the future — but to redefine what’s possible — for yourself, for AI, and the world.

Accommodations

If you need a reasonable accommodation during the application or the recruiting process, please submit a request via this Accommodations Request Form.

Please note that Salesforce uses artificial intelligence (AI) tools to help our recruiters assess and evaluate candidates’ resumes and qualifications throughout the recruiting process. Humans will always make any candidate selection and hiring decisions. Please see our Candidate Privacy Statement for more information about how we use your personal data and your rights, including with regard to use of AI tools and opt out options.

Posting Statement

Salesforce is an equal opportunity employer and maintains a policy of non-discrimination with all employees and applicants for employment. What does that mean exactly? It means that at Salesforce, we believe in equality for all. And we believe we can lead the path to equality in part by creating a workplace that’s inclusive, and free from discrimination. Know your rights: workplace discrimination is illegal. Any employee or potential employee will be assessed on the basis of merit, competence and qualifications – without regard to race, religion, color, national origin, sex, sexual orientation, gender expression or identity, transgender status, age, disability, veteran or marital status, political viewpoint, or other classifications protected by law. This policy applies to current and prospective employees, no matter where they are in their Salesforce employment journey. It also applies to recruiting, hiring, job assignment, compensation, promotion, benefits, training, assessment of job performance, discipline, termination, and everything in between. Recruiting, hiring, and promotion decisions at Salesforce are fair and based on merit. The same goes for compensation, benefits, promotions, transfers, reduction in workforce, recall, training, and education.

In the United States, compensation offered will be determined by factors such as location, job level, job-related knowledge, skills, and experience. Certain roles may be eligible for incentive compensation, equity, and benefits. Salesforce offers a variety of benefits to help you live well including: time off programs, medical, dental, vision, mental health support, paid parental leave, life and disability insurance, 401(k), and an employee stock purchasing program. More details about company benefits can be found at the following link: https://www.salesforcebenefits.com.

Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring, Salesforce will consider for employment qualified applicants with arrest and conviction records.

At Salesforce, we believe in equitable compensation practices that reflect the dynamic nature of labor markets across various regions. The typical base salary range for this position is $172,500 - $260,100 annually. In select cities within the San Francisco and New York City metropolitan areas, the base salary range for this role is $207,800 - $285,800 annually. The range represents base salary only, and does not include company bonus, incentive for sales roles, equity or benefits, as applicable.