AI Data Engineer - Computational Toxicology (Senior Associate)

Pfizer · Pharma · CT

AI Data Engineer focused on Computational Toxicology at Pfizer. The role involves applying Data Engineering and MLOps best practices to automate scientific workflows, curate AI-ready datasets, and enable scalable AI solutions for drug safety science. This includes preparing, structuring, and validating data for AI-enabled toxicology assessments, integrating multi-modal datasets, and implementing scalable data pipelines.

What you'd actually do

  1. Apply Python and/or R programming to data processing, visualization, and exploratory analyses in support of computational safety science workflows.
  2. Implement and support machine learning and data science workflows by preparing, structuring, and validating data for AI-enabled toxicology and safety assessment use cases.
  3. Design, curate, and maintain well-structured datasets and databases for chemical, biological, and toxicology data, ensuring consistency with Pfizer data standards and quality expectations.
  4. Collaborate closely with toxicologists, pathologists, bioinformaticians, and data scientists to integrate multi-modal datasets (e.g., chemical structures, in vitro and in vivo data, omics).
  5. Contribute to foundational data architecture efforts by helping implement scalable, reusable data pipelines and AI-ready data assets.

Skills

Required

  • MS in Biology, Pharmacology, Toxicology, Computer Science, Physics, Statistics, or a related technical discipline, OR BS and 1+ years of experience building AI-powered research applications
  • Python
  • R
  • Git
  • database creation, management, and analysis
  • understanding of data architecture principles to support AI workflows
  • foundational knowledge in biology and/or chemistry
  • communication
  • collaboration
  • problem-solving

Nice to have

  • working with heterogeneous datasets (basic processing, integration, and analysis)
  • Shiny
  • Streamlit
  • LLM concepts
  • RAG concepts
  • foundational software engineering skills
  • clean code
  • NumPy
  • pandas
  • AI-assisted coding tools
  • modern development workflows
  • workflow tools (e.g., Nextflow)

What the JD emphasized

  • AI-enabled drug safety science
  • AI-enabled toxicology and safety assessment use cases
  • AI-ready data assets

Other signals

  • automating scientific workflows
  • curating and engineering AI-ready datasets
  • enabling scalable, reusable AI solutions