Model Behavior Tutor - Wit & Conversation

xAI · AI Frontier · Remote · Post-Training

This role defines and safeguards the personality and voice of an AI model (Grok). The work includes reviewing and scoring model responses, writing and revising sample responses, maintaining personality consistency, creating labeled training datasets for specific conversational elements, and developing evaluation tasks with engineering teams. The goal is to increase user engagement and entertainment while ensuring factual accuracy and a distinctive, witty, culturally fluent voice.

What you'd actually do

  1. Review and score AI model responses against provided rubrics, on criteria such as humor effectiveness, timing, engagement, and conversational naturalness.
  2. Write or revise sample responses to maximize user engagement and entertainment, while ensuring factual accuracy.
  3. Maintain consistency in Grok's personality traits across varied topics (e.g., casual chat, technical explanations, philosophical discussions, humor).
  4. Create and label training datasets focused on elements like humor, irony, banter, and cultural references.
  5. Work with engineering teams to develop evaluation tasks that test and strengthen Grok's defined personality traits.

Skills

Required

  • Advanced proficiency in English, including idioms, rhythm, tone, and nuance (native or equivalent level).
  • Public portfolio with examples of humorous, engaging, or charismatic writing (e.g., published articles, social media threads, scripts, or viral content; provide links).
  • Strong ability to identify tonal inconsistencies, poor pacing, or ineffective phrasing in text.
  • Proven track record of creating writing that is consistently humorous, clever, or engaging, with evidence of broad appeal (e.g., metrics like views, shares, likes, or publications).
  • Extensive knowledge of current internet culture (memes, trends, platforms) and historical/cultural references.

Nice to have

  • Professional experience in comedy writing, screenwriting, viral content, improvisation, or performance.

What the JD emphasized

  • Advanced proficiency in English, including idioms, rhythm, tone, and nuance (native or equivalent level).
  • Proven track record of creating writing that is consistently humorous, clever, or engaging, with evidence of broad appeal (e.g., metrics like views, shares, likes, or publications).

Other signals

  • AI model responses
  • training datasets
  • evaluation tasks