Senior Manager, Data Engineering

SoFi · Fintech · New York, NY · Risk 2LOD

Senior Manager, Data Engineering at SoFi, a fintech company. This is a hands-on leadership position (50% individual contributor, 50% people manager) that owns the data models, pipelines, and infrastructure behind SoFi's risk and AI applications. It calls for deep experience with US banking and financial risk data across domains such as lending, credit, fraud, AML, and compliance. Core responsibilities include designing and maintaining data models; architecting pipelines with dbt, Airflow, Snowflake, and Terraform; leading data and AI projects, including RAG pipeline data management and evaluation dataset curation; managing a team of data engineers; and driving data quality and governance aligned with regulatory expectations. The role demands strong leadership, technical depth in Snowflake, dbt, Python, Airflow, MongoDB, and Terraform, and a deep understanding of banking regulations, with hands-on coding and leading by example throughout.

What you'd actually do

  1. Own the data model - design, build, and maintain integrated data models for lending, credit, fraud, AML, KYC, and related risk domains; ensure models reflect banking semantics and regulatory requirements
  2. Own the data pipeline and infrastructure - architect and manage end-to-end data pipelines using dbt, Airflow, Snowflake, MongoDB, and Terraform; ensure reliability, performance, and scalability
  3. Lead data and AI projects - serve as the data engineering anchor for all data and AI initiatives; partner with full stack engineers, AI/ML engineers, and product managers to deliver production-grade applications; own the data infrastructure for AI use cases including RAG pipeline data management and evaluation dataset / ground truth curation
  4. Lead a team of data engineers - manage and develop the team; set standards for code quality, code review, testing, documentation, and CI/CD practices for data pipelines
  5. Drive data quality and governance - establish data definitions, lineage, and quality standards aligned to regulatory expectations (BCBS 239, SR 11-7, etc.); implement data observability practices including dbt tests, data contracts, freshness SLAs, and anomaly detection to ensure reliability across all pipelines
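Two of the observability practices named above, freshness SLAs and anomaly detection, can be sketched in plain Python. This is a minimal illustration, not SoFi's implementation; the table names, SLA windows, and z-score threshold are assumptions for the example:

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev

def check_freshness(last_loaded: dict, slas: dict, now: datetime) -> list:
    """Return the tables whose last successful load breaches their SLA window.

    last_loaded maps table name -> last load timestamp;
    slas maps table name -> maximum allowed staleness (timedelta).
    """
    return [table for table, ts in last_loaded.items() if now - ts > slas[table]]

def is_anomalous(history: list, latest: float, threshold: float = 3.0) -> bool:
    """Flag a metric (e.g. daily row count) more than `threshold` sample
    standard deviations from its historical mean. Threshold of 3.0 is an
    illustrative default, not a prescribed value."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Illustrative usage with hypothetical risk tables and SLA windows
now = datetime(2024, 1, 2, tzinfo=timezone.utc)
last_loaded = {"loans": now - timedelta(hours=1), "kyc": now - timedelta(hours=30)}
slas = {"loans": timedelta(hours=6), "kyc": timedelta(hours=24)}
stale = check_freshness(last_loaded, slas, now)       # ["kyc"]
spike = is_anomalous([100, 102, 98, 101, 99], 500)    # True
```

In practice, checks like these typically live as dbt tests or Airflow sensor/callback tasks rather than standalone scripts; the logic above just makes the underlying contract explicit.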

Skills

Required

  • 15+ years of experience in data engineering, with the majority of that tenure inside US banks or financial institutions
  • Deep knowledge of banking risk data domains - lending, credit risk, deposits, AML, KYC, fraud, banking regulations, and the critical datasets that support them
  • Expert-level Snowflake, dbt, and Python - proven ability to design and own complex analytical data models; Python fluency for Airflow DAGs, pipeline logic, and data quality scripting
  • Strong pipeline and infrastructure skills - hands-on experience with Airflow, MongoDB, and Terraform in production environments
  • People leadership - experience managing and mentoring data engineers; strong code review culture
  • Banking regulatory awareness - familiarity with BCBS 239, BSA/AML regulations, OCC/Fed/FDIC data expectations
  • Communication - able to translate complex data concepts for risk, compliance, and executive stakeholders

Nice to have

  • Experience at a US bank in Model Risk, Integrated Risk, ERM, or Compliance Analytics
  • Familiarity with GRC platforms (ServiceNow) and risk data warehouses
  • Experience with LLM/AI application data pipelines and observability tooling

What the JD emphasized

  • deep roots in US banking and financial risk data
  • spent most of their career inside US banks
  • Deep knowledge of banking risk data domains
  • Expert-level Snowflake, dbt, and Python
  • Strong pipeline and infrastructure skills
  • Banking regulatory awareness
  • familiarity with BCBS 239, BSA/AML regulations, OCC/Fed/FDIC data expectations

Other signals

  • owns the data models, data pipelines, and data infrastructure that underpin our risk and AI applications
  • own the data infrastructure for AI use cases including RAG pipeline data management and evaluation dataset / ground truth curation
  • partner with full stack engineers, AI/ML engineers, and product managers to deliver production-grade applications