Distinguished Software Data Engineer

Capital One · Banking · San Francisco, CA +2

This role focuses on designing and architecting Capital One's enterprise data platforms, with a specific emphasis on creating a unified Data Product architecture to ensure AI readiness. The goal is to transform fragmented datasets into high-fidelity, machine-ready assets by modernizing the Business Data Catalog and driving data standardization practices. The engineer will build resilient, distributed systems and event-driven patterns, acting as an authoritative expert in data consumption and a visionary builder.

What you'd actually do

  1. You will define and execute a bold technical strategy for a unified Data Product architecture, engineering high-scale solutions for standardized, well-governed data creation that serves as the bedrock for enterprise-wide AI readiness. By modernizing our foundational Business Data Catalog, you will transform how the organization discovers and consumes data, turning fragmented datasets into high-fidelity, machine-ready assets.
  2. You will act as the bridge between deep engineering and business impact, institutionalizing and driving the adoption of rigorous data standardization practices so that our data consumption layer becomes a competitive engine that enables the enterprise to innovate with speed and precision.
  3. You are a visionary builder who balances high-level architecture with "hands-on-keyboard" execution, designing resilient, multi-tenant distributed systems and event-driven patterns that redefine enterprise scale.
  4. As the authoritative expert in data consumption, you don't just advise—you build the proof-of-concepts and exploratory analyses that bridge the gap between abstract tech strategy and mission-critical, production-grade solutions.
  5. Beyond the code, you are a catalyst for growth, mentoring the next generation of world-class engineers and championing a culture of inclusive excellence and inner-sourcing.

Skills

Required

  • data engineering
  • data architecture
  • AWS

Nice to have

  • data modeling
  • ontology standards
  • Python
  • SQL
  • Scala
  • deploying machine learning models
  • big data processing solutions on AWS

What the JD emphasized

  • AI readiness
  • machine-ready assets
  • AI-ready technologies

Other signals

  • enterprise-wide AI readiness