Model Behavior Tutor - Epistemic Rigor & Truthfulness

xAI · AI Frontier · Remote · Post-Training

This role focuses on improving the epistemic rigor and truthfulness of AI models, specifically ensuring they reason carefully, avoid motivated reasoning, and communicate uncertainty appropriately. Responsibilities include assessing model outputs for accuracy and logical coherence, identifying fallacies, writing exemplary reasoning, and constructing adversarial examples. The role requires a strong analytical background, a track record in forecasting or rigorous analysis, and deep knowledge in fields like philosophy of science, cognitive psychology, or statistics.

What you'd actually do

  1. Assess model outputs for factual accuracy, logical coherence, fallacious reasoning, and hidden assumptions.
  2. Identify subtle ideological capture, statistical fallacies, and rhetorical sleights of hand.
  3. Write exemplary reasoning that models intellectual honesty, source evaluation, nuanced weighing of primary and secondary sources, and scoping of confidence.
  4. Construct adversarial examples and red-team prompts to expose remaining epistemic weaknesses.
  5. Contribute to the definition and scaling of constitutional principles for truth-seeking behavior.

Skills

Required

  • Philosophy of science
  • Cognitive psychology
  • Statistics
  • Logic
  • Linguistics
  • History
  • Economics
  • Forecasting
  • Analytical thinking
  • Critical thinking
  • Source evaluation
  • Nuanced weighing of information
  • Scoping of confidence

Nice to have

  • Intelligence analysis
  • Investigative journalism
  • Academic peer review

What the JD emphasized

  • Published analytical work and academic training in a high-rigor field.
  • Strong forecasting track record (e.g., Metaculus, Good Judgment), a record of rigorous analysis, or a history of publicly updating on errors.
  • Deep knowledge in at least three of: philosophy of science, cognitive psychology, statistics, logic, linguistics, history, economics, or related disciplines.
  • Ability to steel-man opposing views and separate settled knowledge from speculation.
  • Habitual reliance on primary sources and base rates.

Other signals

  • Ensuring model reasoning is careful, resists motivated reasoning, and communicates uncertainty in proportion to the evidence.