Staff Machine Learning Engineer, Oncology Foundation Model

Tempus AI · Vertical AI · Chicago, IL +1 · Remote

A Staff Machine Learning Engineer role focused on architecting, building, and maintaining the critical data infrastructure behind large multimodal generative AI models in healthcare, covering diverse data types such as genomics, pathology images, radiology scans, and clinical notes.

What you'd actually do

  1. Architect and build sophisticated data processing workflows that ingest, process, and prepare multimodal training data and integrate seamlessly with large-scale distributed ML training frameworks and infrastructure (GPU clusters).
  2. Develop strategies for efficient, compliant data ingestion from diverse sources, including internal databases, third-party APIs, public biomedical datasets, and Tempus's proprietary data ecosystem.
  3. Utilize, optimize, and contribute to frameworks specialized for large-scale ML data loading and streaming (e.g., MosaicML Streaming, Ray Data, HF Datasets).
  4. Collaborate closely with infrastructure and platform teams to leverage and optimize cloud-native services (primarily GCP) for performance, cost-efficiency, and security.
  5. Engineer efficient connectors and data loaders for accessing and processing information from diverse knowledge sources, such as knowledge graphs, internal structured databases, biomedical literature repositories (e.g., PubMed), and curated ontologies.

Skills

Required

  • Master's degree in Computer Science, Artificial Intelligence, Software Engineering, or a related field
  • 8+ years of industry experience in designing, building, and operating large-scale data pipelines and infrastructure in a production environment
  • Strong experience working with massive, heterogeneous datasets (TBs+) and modern distributed data processing tools and frameworks such as Apache Spark, Ray, or Dask
  • Strong, hands-on experience with tools and libraries specifically designed for large-scale ML data handling, such as Hugging Face Datasets, MosaicML Streaming, or similar frameworks (e.g., WebDataset, Petastorm)
  • Experience with MLOps tools and platforms (e.g., MLflow, Kubeflow, SageMaker Pipelines)
  • Understanding of the data challenges specific to training large models (Foundation Models, LLMs, Multimodal Models)
  • Proficiency in programming languages like Python
  • Proven ability to bring thought leadership to the product and engineering teams, influencing technical direction and data strategy
  • Experience mentoring junior engineers and collaborating effectively with cross-functional teams
  • Excellent communication skills
  • Strong bias-to-action and ability to thrive in a fast-paced, dynamic research and development environment
  • A pragmatic approach focused on delivering rapid, iterative, and measurable progress towards impactful goals

Nice to have

  • PhD in Computer Science, Engineering, Bioinformatics, or a related field
  • Contributions to relevant open-source projects
  • Direct experience working with clinical or biological data (EHR, genomics, medical imaging)

What the JD emphasized

  • large-scale multimodal model systems engineering
  • foundational data infrastructure
  • generative AI models
  • multimodal systems
  • genomics, pathology images, radiology scans, and clinical notes
  • Architect and build sophisticated data processing workflows
  • ingesting, processing, and preparing multimodal training data
  • large-scale distributed ML training frameworks
  • efficient, compliant data ingestion
  • large-scale ML data handling
  • training large models (Foundation Models, LLMs, Multimodal Models)
  • massive, heterogeneous datasets (TBs+)
  • modern distributed data processing tools and frameworks
  • MLOps tools and platforms
