Senior Lead AI Engineer (Gen AI Platform Services)

Capital One · Banking · San Jose, CA +2

This role focuses on engineering and optimizing AI software components, particularly large language model inference and related platform services, to improve performance, scalability, cost, and latency in a production environment. It involves designing, developing, testing, deploying, and supporting these components, leveraging various AI technologies and cloud platforms.

What you'd actually do

  1. Design, develop, test, deploy, and support AI software components, including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability.
  2. Leverage a broad stack of open-source and SaaS AI technologies such as AWS UltraClusters, Hugging Face, VectorDBs, NeMo Guardrails, PyTorch, and more.
  3. Invent and introduce state-of-the-art LLM optimization techniques to improve the performance (scalability, cost, latency, throughput) of large-scale production AI systems.
  4. Contribute to the technical vision and the long-term roadmap of foundational AI systems at Capital One.

Skills

Required

  • Python
  • Go
  • Scala
  • Java
  • Computer Science
  • AI
  • Electrical Engineering
  • Computer Engineering

Nice to have

  • AWS
  • Google Cloud
  • Azure
  • private cloud
  • LLM Inference
  • Similarity Search
  • VectorDBs
  • Guardrails
  • Memory
  • optimizing training and inference software
  • Python
  • C++
  • C#
  • Java
  • Golang
  • AI research
  • AI systems
  • communication
  • presentation skills

What the JD emphasized

  • AI software components
  • foundation model training
  • large language model inference
  • similarity search
  • guardrails
  • model evaluation
  • experimentation
  • governance
  • observability
  • LLM optimization techniques
  • scalability
  • cost
  • latency
  • throughput
  • AI systems
  • Python
  • Go
  • Scala
  • Java
  • deploying scalable and responsible AI solutions
  • complex AI systems
  • LLM Inference
  • Similarity Search and VectorDBs
  • Guardrails
  • Memory
  • optimizing training and inference software

Other signals

  • foundation model training
  • large language model inference
  • model evaluation
  • governance
  • observability
  • LLM optimization techniques
  • scalability
  • cost
  • latency
  • throughput