Principal Research Engineer, Post-training

Character AI · AI Frontier · Redwood City, CA · Technical Staff - ML

Principal Research Engineer focused on post-training of large language models (LLMs) to create engaging and aligned AI products for Character.AI. The role involves technical leadership, research in alignment algorithms, and the design of efficient training and inference systems. It requires a PhD or equivalent experience, a strong understanding of ML, and experience delivering production ML systems.

What you'd actually do

  1. Define and drive the technical roadmap for mid- and post-training systems, balancing research innovation with production reliability and scalability.
  2. Lead the development of alignment algorithms, optimization techniques, and training objectives to improve model capabilities and data efficiency.
  3. Lead the design of efficient training and inference systems for large-scale generative models.

Skills

Required

  • PhD in Computer Science, Machine Learning, AI, or a related field, or equivalent industry experience
  • Significant experience leading technical projects or teams in machine learning, AI research, or large-scale distributed systems
  • Experience scaling and mentoring high-performing research and engineering teams
  • Deep understanding of modern machine learning techniques, including transformers, reinforcement learning, alignment methods, and large language models
  • Strong track record of delivering impactful research or applied ML systems in production environments
  • Expertise in designing, building, and maintaining production-quality ML systems and infrastructure
  • Experience training, serving, debugging, and optimizing large-scale models on GPU-based systems
  • Experience leading teams working on large language model training, mid-training, or post-training
  • Experience with product experimentation, online evaluation, and A/B testing frameworks
  • Strong software engineering skills with the ability to write clean, maintainable, and scalable code
  • Excellent communication skills and the ability to influence technical direction across teams
  • Experience leading complex, cross-functional initiatives across data, training infrastructure, evaluation, and model serving

Nice to have

  • Hands-on experience working directly with open-source models like Mistral and Qwen, particularly adapting them via mid- and post-training for specific personas, creative writing, or role-playing applications
  • Familiarity with cloud-native ML infrastructure, including Kubernetes, Docker, and modern orchestration platforms
  • Publications in leading machine learning conferences or demonstrated contributions to the broader AI community

What the JD emphasized

  • PhD in Computer Science, Machine Learning, AI, or a related field, or equivalent industry experience
  • Significant experience leading technical projects or teams in machine learning, AI research, or large-scale distributed systems
  • Strong track record of delivering impactful research or applied ML systems in production environments
  • Experience training, serving, debugging, and optimizing large-scale models on GPU-based systems
  • Experience leading teams working on large language model training, mid-training, or post-training
  • Experience with product experimentation, online evaluation, and A/B testing frameworks

Other signals

  • post-training of top-tier OSS LLMs
  • transform foundation models into intelligent, engaging, and aligned products
  • lead initiatives spanning data, algorithms, infrastructure, and evaluation
  • improve model performance and user experience
  • shape the conversational experiences of millions of users