Director of Production Support Engineering

Salesforce · Enterprise · San Francisco, CA +2

Director of Production Support Engineering for a high-growth AI CRM product, focusing on scaling, reliability, and compliance for enterprise and government clients. The role involves strategic roadmap development, people leadership, managing multi-substrate deployments, overseeing legacy systems, and ensuring infrastructure meets strict regulatory requirements. It emphasizes leveraging AI tools for development workflows and integrating AI agents into human processes.

What you'd actually do

  1. Devise and champion a long-term infrastructure strategy that balances rapid AI product innovation with the rigorous demands of enterprise-grade reliability and global scale.
  2. Manage and grow a world-class team of Production Support Engineers. You will be responsible for technical mentorship, career development, and maintaining a high-performance, low-ego culture.
  3. Lead the transition to a multi-region and multi-substrate deployment model within the Salesforce commercial cloud, ensuring our "Super Agent" architecture is resilient across global geographies.
  4. Oversee the maintenance and support of our legacy environments and bespoke customer deployments, ensuring a seamless experience for long-term partners as they transition toward our modern stack.
  5. Act as the primary stakeholder for our Private Cloud (PCE) and Government Cloud initiatives, ensuring our infrastructure meets strict regulatory and data-residency requirements without sacrificing developer velocity.

Skills

Required

  • 10+ years of experience in Production Engineering, SRE, or Infrastructure roles
  • 3-5 years in a formal people management capacity
  • Kubernetes
  • Terraform
  • distributed databases (PostgreSQL)
  • operational nuances of AI/GPU workloads
  • managing production environments through a "1 to 100" scaling phase
  • multi-region or multi-cloud context
  • articulate a complex technical vision and turn it into an actionable roadmap
  • supporting "Big Iron" enterprise customers with custom deployment needs and high-compliance requirements
  • lead through major incidents and complex architectural migrations
  • leveraging AI tools to optimize team workflows
  • clear vision for how AI will transform the future of production support

Nice to have

  • AI CRM
  • agentic era
  • startup-within-Salesforce
  • Supply Chain
  • multi-environment global operation
  • modern Salesforce-native deployment
  • legacy environments
  • highly specialized custom deployments
  • global tier-one customers
  • embedded reliability organization
  • Lead and Principal Engineers (LMTS/PMTS)
  • orchestration and strategy
  • commercial Salesforce offering
  • five-nines reliability
  • Super Agent architecture
  • legacy environments and bespoke customer deployments
  • long-term partners
  • Private Cloud (PCE)
  • Government Cloud initiatives
  • data-residency requirements
  • developer velocity
  • SLIs, SLOs, and error budgets
  • incident-to-insight loop
  • Product, Engineering, and Security leadership
  • Agentforce ecosystem
  • production-grade software
  • modern engineering practices
  • AI development tools
  • secure, optimized, and high-quality code
  • design and orchestrate complex systems
  • human workflows
  • shared system context
  • system designs, constraints, and standards
  • AI to operate accurately and reliably
  • Human or AI-generated code

What the JD emphasized

  • highest levels of security and sovereignty
  • strict regulatory and data-residency requirements
  • AI as a core part of your development workflow
  • AI agents integrate seamlessly into human workflows
  • shared system context, an explicit repository of system designs, constraints, and standards that enables AI to operate accurately and reliably
  • Critically evaluate code (Human or AI-generated) for correctness, quality, security, and performance
  • understand the operational nuances of AI/GPU workloads
  • AI-Forward Mindset
  • demonstrated, genuine AI-first approach to engineering

Other signals

  • scaling Agentforce for Supply Chain from 1 to 100
  • managing the production excellence of a high-growth AI product
  • multi-region, multi-substrate powerhouse
  • Private Cloud Edition (PCE) and Government Cloud offerings