Principal Engineer, LLM

Upstart · Fintech · Remote · Machine Learning

Principal Engineer for Upstart's Core GenAI Platform team, responsible for technical leadership, roadmap definition, architecture, and adoption strategy for a new GenAI platform. The role focuses on enabling safe, efficient, and responsible LLM use across the company, identifying and solving systemic technical risks, and establishing operational standards for GenAI integrations. This includes building scalable systems for model inference, orchestration, and compliance, as well as tooling that accelerates engineering productivity.

What you'd actually do

  1. Define and drive the multi-year technical roadmap for the Core GenAI Platform, influencing adoption and alignment across several engineering teams
  2. Partner with VP- and CTO-level leadership to shape business strategy with technical insight
  3. Lead architecture and design of large-scale, mission-critical systems for model inference, orchestration, and compliance
  4. Identify and solve systemic technical risks that impact the entire company and business-critical AI initiatives
  5. Represent Upstart internally and externally as a thought leader in GenAI systems through executive reviews, cross-company forums, and industry venues

Skills

Required

  • LLM-specific infrastructure
  • inference optimization (quantization, ONNX, streaming)
  • RAG architectures
  • model lifecycle management (experimentation to production)
  • vector databases
  • automated evaluation pipelines (heuristic metrics, LLM-as-a-judge)
  • LLM toolchains (LangChain, LlamaIndex, OpenAI APIs)
  • ML/LLM platform design and operationalization
  • cost, latency, and reliability optimization
  • GenAI best practices (hallucination mitigation, fairness, explainability, data privacy)

Nice to have

  • model fine-tuning
  • alignment (RLHF/DPO)
  • full-stack engineering (Python, FastAPI, Kotlin, Spring, React/TypeScript)
  • backend systems and infrastructure (Kubernetes, Docker, Terraform, cloud-native architectures)

What the JD emphasized

  • LLM-specific infrastructure
  • inference optimization
  • RAG architectures
  • production environments
  • automated evaluation pipelines
  • ML/LLM platforms
  • GenAI best practices
  • hallucination mitigation
  • fairness
  • explainability
  • data privacy
  • model fine-tuning
  • alignment (RLHF/DPO)
