Lead Agentic Data Systems Engineer

Salesforce · Enterprise · San Francisco, CA

Lead Agentic Data Systems Engineer responsible for architecting, building, and maintaining a private ecosystem of autonomous agents for ETL, synthetic data generation, automated QA, and predictive modeling. The role focuses on designing multi-step reasoning architectures and verification protocols for agent self-validation, and on building Model Context Protocol (MCP) servers that give agents secure access to data. Requires strong Python, dbt, Airflow, SQL, Spark, and Snowflake skills, plus expertise in the LangGraph agentic framework.

What you'd actually do

  1. Architect and maintain a private ecosystem of 10+ autonomous agents specialized in ETL, synthetic data generation, automated QA, and predictive modeling.
  2. Design multi-step reasoning architectures and verification protocols to ensure agents autonomously validate and peer-review their own outputs.
  3. Transform high-level, ambiguous business requirements into production-ready data products independently, bypassing the need for mid-level project management.
  4. Use domain knowledge to ensure deployed tools are well governed: apply governance as code to data pipelines and agentic development, and build context-aware agents.
  5. Develop and maintain Model Context Protocol (MCP) servers to provide agents with secure, deep-link access to Snowflake, Salesforce, AWS, and proprietary internal data catalogs.
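
As a rough illustration of the self-validation idea in item 2, the sketch below shows a generate, verify, and retry loop in plain Python. It is not taken from the job description and does not use LangGraph; the names run_with_verification, Review, LoopResult, toy_generate, and toy_verify are hypothetical, and the "agent" and "reviewer" are toy stand-ins for real model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Review:
    """Verdict from a verifier: pass/fail plus feedback for the next attempt."""
    ok: bool
    feedback: str = ""


@dataclass
class LoopResult:
    """Final output (None if never accepted), attempts used, and feedback log."""
    output: Optional[str]
    attempts: int
    history: List[str] = field(default_factory=list)


def run_with_verification(
    generate: Callable[[Optional[str]], str],
    verify: Callable[[str], Review],
    max_attempts: int = 3,
) -> LoopResult:
    """Call generate, check the result with verify, and retry with feedback.

    Stops at the first accepted candidate or after max_attempts tries.
    """
    feedback: Optional[str] = None
    history: List[str] = []
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        review = verify(candidate)
        if review.ok:
            return LoopResult(candidate, attempt, history)
        # Record the rejection and pass the feedback into the next attempt.
        history.append(review.feedback)
        feedback = review.feedback
    return LoopResult(None, max_attempts, history)


# Toy stand-ins (assumptions for illustration only, not real agents).
def toy_generate(feedback: Optional[str]) -> str:
    if feedback is None:
        return "SELECT * FROM orders"
    return "SELECT order_id, amount FROM orders"


def toy_verify(sql: str) -> Review:
    if "*" in sql:
        return Review(False, "List columns explicitly instead of SELECT *.")
    return Review(True)


result = run_with_verification(toy_generate, toy_verify)
print(result.output, result.attempts)
```

In a real system the generator and verifier would typically be separate agents or model calls, and the attempt cap keeps a failing loop from running indefinitely.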

Skills

Required

  • Python
  • dbt
  • Airflow
  • advanced SQL
  • Apache Spark
  • Snowflake
  • Prompt Engineering
  • LangGraph
  • chain-of-thought prompting
  • self-correction loops
  • iterative reasoning paths
  • Docker
  • Kubernetes
  • serverless compute environments
  • Data Mesh
  • Data-as-a-Product (DaaP)
  • Event-Driven Architectures
  • Semantic layer
  • Knowledge Graphs

Nice to have

  • Cursor
  • Codex
  • Claude Code
  • Salesforce Core
  • Data 360

What the JD emphasized

  • production-grade
  • autonomous agents
  • multi-step reasoning architectures
  • production-ready data products
  • well governed
  • Contextual Integration
  • agentic workloads
  • documented history of using generative AI to accelerate personal and departmental output by orders of magnitude
  • function as a "Domain Data Officer"
  • subtle logic errors or hallucinations in agentic output

Other signals

  • building autonomous agents
  • managing agentic systems
  • data engineering for AI