Principal Data Scientist – R&D DSDH – Preclinical Sciences & Translational Safety (PSTS)

Johnson & Johnson · Pharma · Spring House, PA +5

The Principal Data Scientist will apply advanced machine learning and data engineering techniques to create AI-ready datasets, develop predictive models, and deliver analytical solutions for safety evaluations and translational research within the Preclinical Sciences & Translational Safety (PSTS) organization. The role works closely with toxicologists, PK/PD specialists, and researchers.

What you'd actually do

  1. Develop and deploy ML/AI models to support safety signal detection, dose selection, PK/PD modeling, toxicology insights, and translational interpretation.
  2. Implement representation learning, predictive modeling, and multivariate analytics for datasets spanning in vivo studies, in vitro assays, exposure‑response data, and pathology information.
  3. Build and maintain scalable data pipelines that integrate PSTS‑relevant data sources (e.g., toxicology studies, PK/PD datasets, biomarker readouts, animal study repositories).
  4. Transform raw experimental outputs into standardized, analysis‑ready, AI‑ready datasets using Python, R, and cloud‑native services.
  5. Work directly with toxicology, DMPK, and safety stakeholders to interpret scientific context and translate study designs into computational requirements.

Skills

Required

  • Machine learning
  • Data engineering
  • Python
  • R
  • SQL
  • Cloud computing
  • Workflow orchestration
  • Version control
  • ML model development
  • ML model evaluation
  • ML model deployment
  • Biological datasets
  • Toxicology datasets
  • PK/PD datasets
  • In vivo datasets

Nice to have

  • Safety sciences
  • ADME/DMPK
  • Toxicogenomics
  • Biomarker analytics
  • Scientific data formats
  • Ontologies
  • Semantic technologies
  • Knowledge graph integration
  • AWS S3
  • Snowflake
  • Redshift

What the JD emphasized

  • Advanced degree (MS or PhD) in Data Science, Computational Biology, Toxicology, Pharmacology, Biomedical Engineering, Computer Science, or related field.
  • 3+ years of experience applying machine learning and/or data engineering to scientific or biomedical datasets.
  • Proficiency with Python and/or R, SQL, and modern data engineering tooling (cloud computing, workflow orchestration, version control).
  • Experience with ML model development, evaluation, and deployment pipelines.
  • Experience working with biological, toxicology, PK/PD, or in vivo datasets.

Other signals

  • Develop and deploy ML/AI models
  • Build and maintain scalable data pipelines
  • Transform raw experimental outputs into standardized, analysis-ready, AI-ready datasets