Distinguished Engineer

Capital One · Banking · Richmond, VA +1

Distinguished Engineer role focused on defining and implementing the multi-year reliability roadmap and target architectural state for the product portfolio. The role places strong emphasis on using Generative AI in an enterprise context to enhance automation, code quality, and operational workflows, with the goal of autonomous operations and predictive reliability.

What you'd actually do

  1. Define and implement the multi-year reliability roadmap and target architectural state for the entire product portfolio.
  2. Invent and deliver novel engineering solutions to reduce organizational toil, maximizing engineering velocity across all services.
  3. Serve as the final escalation point for system crises, codifying best practices to build an organizational culture of outage prevention.
  4. Govern secure IaC/Platform strategy, driving enterprise-wide adoption and standardization across multi-cloud environments.
  5. Establish organization-wide reliability standards and Error Budget governance to align business objectives with engineering risk.

Skills

Required

  • Software Engineering
  • Site Reliability Engineering (SRE)
  • Solution Architecture
  • Enterprise Architecture
  • highly available system design
  • design patterns
  • Cloud computing (AWS, Microsoft Azure, Google Cloud)
  • Web technologies (JavaScript, TypeScript and SPA frameworks)
  • leading and applying Generative AI/headless AI to enhance automation, code quality or operational workflow

Nice to have

  • Master's Degree in Computer Science or a related field
  • defining and executing multi-year, organization-wide IaC and CI/CD strategic roadmaps
  • optimizing hyper-scale distributed systems and advanced cloud-native architectures
  • integrating SRE models (SLOs, Error Budgets) to drive cultural and process transformation
  • designing and deploying holistic, cost-optimized observability platforms and advanced, metrics-driven SLO governance
  • cloud security, including establishing comprehensive security and compliance governance (Zero Trust, Secrets Management)
  • designing, scaling, and operating next-generation platforms, delivering measurable 5-10x reliability and efficiency gains
  • leveraging Generative/Headless AI to achieve autonomous operations and predictive reliability

What the JD emphasized

  • Generative AI/headless AI to enhance automation, code quality or operational workflow
  • autonomous operations and predictive reliability

Other signals

  • leading experts in their domains
  • drive innovation at multiple levels
  • architect solutions
  • thought leadership
  • engineering excellence