Principal Data Scientist – R&D DSDH – Preclinical Sciences & Translational Safety (PSTS)

Johnson & Johnson · Pharma · Madrid, Spain +2

The Principal Data Scientist will leverage advanced machine learning and data engineering techniques to develop predictive models and analytical solutions for safety signal detection, dose selection, and translational research within pharmaceutical R&D. This role involves creating AI-ready datasets from various preclinical study types and collaborating with scientific experts to drive impactful decisions in toxicology and PK/PD.

What you'd actually do

  1. Develop and deploy ML/AI models to support safety signal detection, dose selection, PK/PD modeling, toxicology insights, and translational interpretation.
  2. Implement representation learning, predictive modeling, and multivariate analytics for datasets spanning in vivo studies, in vitro assays, exposure‑response data, and pathology information.
  3. Build and maintain scalable data pipelines that integrate PSTS‑relevant data sources (e.g., toxicology studies, PK/PD datasets, biomarker readouts, animal study repositories).
  4. Transform raw experimental outputs into standardized, analysis‑ready, AI‑ready datasets using Python, R, and cloud‑native services.
  5. Work directly with toxicology, DMPK, and safety stakeholders to interpret scientific context and translate study designs into computational requirements.

Skills

Required

  • Machine Learning
  • Data Engineering
  • Python
  • R
  • SQL
  • Cloud Computing
  • Workflow Orchestration
  • Version Control
  • Model Development
  • Model Evaluation
  • Model Deployment
  • Toxicology Data
  • PK/PD Data
  • In Vivo Data

Nice to have

  • Safety Sciences
  • ADME/DMPK
  • Toxicogenomics
  • Biomarker Analytics
  • Scientific Data Formats
  • Ontologies
  • Semantic Technologies
  • Knowledge Graph Integration
  • AWS S3
  • Snowflake
  • Redshift
  • Regulatory Data Standards
  • SEND
  • CDISC

What the JD emphasized

  • Advanced degree (MS or PhD) in Data Science, Computational Biology, Toxicology, Pharmacology, Biomedical Engineering, Computer Science, or related field.
  • 3+ years of experience applying machine learning and/or data engineering to scientific or biomedical datasets.
  • Proficiency with Python and/or R, SQL, and modern data engineering tooling (cloud computing, workflow orchestration, version control).
  • Experience with ML model development, evaluation, and deployment pipelines.
  • Experience working with biological, toxicology, PK/PD, or in vivo datasets.

Other signals

  • Develop and deploy ML/AI models
  • AI-ready datasets
  • predictive models
  • analytical solutions