Data Architect

Tempus AI · Vertical AI · Chicago, IL +2

Data Architect to design the structural backbone of a high-scale, multi-modal healthcare data ecosystem, with a focus on data environments that serve AI agents and enable real-time clinical evaluation.

What you'd actually do

  1. Multi-Modal Architecture: Lead the design and management of an enterprise data model that integrates complex domains including clinical EHR records, high-throughput genomics (NGS), and cardiovascular imaging (Echo, Cath, ECG).
  2. Scale for Hospital Networks: Architect data solutions designed to scale across federated networks of hospitals, ensuring multi-tenancy, high availability, and performance across hybrid cloud environments.
  3. Enable Agentic Workflows: Design data access patterns and metadata layers specifically optimized for AI agents, allowing them to autonomously discover, query, and reason over structured and unstructured datasets.
  4. Schema & API Ownership: Author and maintain entity-relationship diagrams (ERDs), data dictionaries, and API specifications across multiple technologies (relational, NoSQL, and vector databases).
  5. Data Quality & Traceability: Implement automated solutions to monitor data quality and lineage with strict traceability back to source systems, ensuring "ground truth" for agentic evaluations.

Skills

Required

  • 7+ years in data architecture or enterprise modeling
  • Significant experience in the healthcare or life sciences domain
  • Expert-level knowledge of 3NF, Dimensional (Star Schema), and Data Vault 2.0 modeling techniques
  • Exceptional SQL skills for complex analytical environments
  • Proficiency in Python for data profiling and debugging
  • Ability to articulate the trade-offs between RDBMS, MPP, and NoSQL technologies
  • Experience implementing Master Data Management (MDM) solutions
  • Deep familiarity with HL7, FHIR, and Epic/Cerner data structures
  • Proficiency with modeling tools such as Erwin, Vertabelo, or Lucidchart

Nice to have

  • Experience with vector databases (e.g., Pinecone, Weaviate, or pgvector) or graph databases to support RAG and agentic memory
  • Hands-on experience with GCP (BigQuery, Vertex AI) or AWS healthcare-native services
  • Direct experience working with EHR, OMOP, DICOM, genomic data models, or longitudinal patient records

What the JD emphasized

  • HIPAA-regulated environment

Other signals

  • designing data access patterns and metadata layers specifically optimized for AI agents
  • data fabric that supports real-time clinical evaluation at an enterprise scale
  • connects an entire ecosystem of real-world evidence to deliver real-time, actionable insights