Staff Data Scientist: Semantic Substrate Incubation

Qualtrics · Seattle, WA · Core AI

Staff Data Scientist role building the Semantic Brain, an intelligent, concept-linked graph derived from raw event logs. The role involves architecting an Identity-Anchored World Model that lets LLMs understand enterprise concepts and drive autonomous, agentic decisions. It focuses on bridging AI research and production engineering, customer translation, and technical leadership. Key responsibilities include mapping data to ontologies, designing a Concept Graph, training autonomous agents with Reward Signal Extraction, and using Off-Policy Evaluation (OPE) to reduce risk. The role requires deep graph expertise (AWS Neptune, SPARQL), production data pipelines (Spark), modern LLM orchestration (LangChain, LlamaIndex), and cloud architecture (AWS, Python).

What you'd actually do

  1. Map fragmented data to human-readable terms by leading the discovery and mapping of raw event logs to Vertical Ontologies (Industry Knowledge Packs).
  2. Improve AI accuracy by 60% by designing and deploying a Concept Graph that anchors the substrate, using verified profile IDs instead of session data for memory.
  3. Train autonomous agents efficiently by building the logic for Reward Signal Extraction and Context-Aware actioning to infer KPIs directly from interaction logs, avoiding traditional delayed-reward bottlenecks.
  4. Reduce agentic action risk by 40% by utilizing Off-Policy Evaluation (OPE) and action-conditional world models to simulate high-value scenarios and ground recommendations.
  5. Avoid the "Services Trap" and enable scale by engineering automated systems that allow 80% of the team's context mapping to be executed seamlessly without manual intervention.

Skills

Required

  • Graph databases
  • AWS Neptune
  • SPARQL
  • Apache Spark (PySpark/Scala)
  • LangChain
  • LlamaIndex
  • Python
  • AWS ecosystem (EC2, Lambda, S3, CloudFormation, CDK, or Terraform)

Nice to have

  • semantic understanding
  • concept-linked graph
  • Identity-Anchored World Model
  • LLMs
  • autonomous agents
  • Reward Signal Extraction
  • Context-Aware actioning
  • Off-Policy Evaluation (OPE)
  • action-conditional world models
  • data pipelines
  • rapid prototyping

What the JD emphasized

  • zero-to-one startup mentality
  • building foundational AI products and data pipelines from scratch
  • deep applied AI research and robust production engineering
  • working directly with pilot partners and customers
  • ownership of the technical vision
  • simulation, validation, and off-policy evaluation
  • pioneer agentic AI
  • absolute bleeding edge of LLM orchestration and world modeling
  • setting industry standards for how enterprises deploy autonomous agents
  • foundational AI product roadmap
  • Proven Track Record in AI/ML
  • typically requires around 10+ years of professional data science experience
  • Deep Graph Expertise
  • Production-Level Data Pipelines
  • Modern LLM Orchestration
  • Cloud Architecture
  • true zero-to-one incubation team
  • Obsessed with Ground Truth
  • solving the deepest technical challenges in the enterprise AI space with academic rigor

Other signals

  • building foundational AI products from scratch
  • architecting an Identity-Anchored World Model
  • driving autonomous, agentic decisions
  • understanding complex enterprise ideas
  • LLM orchestration and world modeling