What you'd actually do

Develop ML/AI models that support discovery workflows, including target prioritization, multi‑omics integration, and mechanistic inference.

Apply modern ML approaches (e.g., deep learning, graph learning, foundation models, generative models) to chemical, biological, imaging, and assay datasets.

Build and optimize models for real‑world R&D use cases, ensuring scalability, interpretability, and scientific rigor.

Design, build, and maintain robust data pipelines that curate, standardize, and integrate diverse R&D datasets (chemical, biological, multi‑omics, imaging, biophysical, automation logs, etc.).

Partner with platform teams to implement best‑practice MLOps/DevOps workflows and deploy ML models into production R&D environments.

Skills

Required

Python
PyTorch
TensorFlow
scikit-learn
RDKit
Data Engineering
MLOps
DevOps
Cloud Computing (AWS, GCP, or Azure)

Nice to have

Ph.D.
Computational Biology
Bioinformatics
Data Science
Chemistry
Chemical Biology
Biomedical Engineering
Computer Science
Drug Discovery
Biology
Systems Biology
Imaging
Pharma or Biotech Discovery
Target Assessment
Phenotypic Screening
Medicinal Chemistry Workflows
Lab Automation
Omics
High-content Imaging
Chemical Structure Data
Biological Assay Data
FAIR Data Standards
Ontologies
Controlled Vocabularies
Regulated Environments
Quality-governed Environments

What the JD emphasized

Master’s or Ph.D. in Computational Biology, Bioinformatics, Data Science, Chemistry, Chemical Biology, Biomedical Engineering, Computer Science, or related field.

Experience applying ML/AI in scientific domains (drug discovery, biology, chemistry, systems biology, imaging, or related areas).

Strong programming skills in Python (preferred) and experience with scientific/ML libraries (PyTorch, TensorFlow, scikit‑learn, RDKit, etc.).

Practical experience with data engineering, including data modeling, workflow orchestration, ETL/ELT pipelines, and cloud computing environments (AWS, GCP, or Azure).

At Johnson & Johnson, we believe health is everything. Our strength in healthcare innovation empowers us to build a world where complex diseases are prevented, treated, and cured, where treatments are smarter and less invasive, and solutions are personal. Through our expertise in Innovative Medicine and MedTech, we are uniquely positioned to innovate across the full spectrum of healthcare solutions today to deliver the breakthroughs of tomorrow, and profoundly impact health for humanity. Learn more at jnj.com

As guided by Our Credo, Johnson & Johnson is responsible to our employees who work with us throughout the world. We provide an inclusive work environment where each person is considered as an individual. At Johnson & Johnson, we respect the diversity and dignity of our employees and recognize their merit.

**Job Function:**

Data Analytics & Computational Sciences

**Job Sub Function:**

Data Science

Job Category:

Scientific/Technology

All Job Posting Locations:

Cambridge, Massachusetts, United States of America; La Jolla, California, United States of America; Spring House, Pennsylvania, United States of America

Job Description:

Johnson & Johnson Innovative Medicine is recruiting for a Principal Data Scientist – R&D DSDH - Therapeutics Discovery (TD).

The primary location for this position is open to Spring House, PA; Titusville, NJ; Cambridge, MA; San Diego, CA; Beerse, Belgium; Madrid, Spain; or Barcelona, Spain.

Candidates interested in our EMEA-based locations, please apply to R-069202.

About the Role

Johnson & Johnson Innovative Medicine is seeking a highly skilled R&D Data Scientist to support our Therapeutics Discovery (TD) organization. This role sits within the R&D Data Science group and will focus on building and applying advanced Machine Learning (ML) and Data Engineering solutions that accelerate scientific innovation across the drug discovery lifecycle. The ideal candidate brings strong computational expertise and a solid scientific understanding of early R&D, including areas such as Target Identification & Assessment, Lead Identification & Optimization, Mechanistic / Mode of Action studies, and Lab Automation & high‑throughput experimentation.

The Data Scientist will collaborate closely with discovery scientists, automation engineers, computational biologists, and platform technology teams to transform complex, multimodal R&D data into actionable insights that drive therapeutic innovation.

Key Responsibilities

Machine Learning & Modeling

Develop ML/AI models that support discovery workflows, including target prioritization, multi‑omics integration, and mechanistic inference.
Apply modern ML approaches (e.g., deep learning, graph learning, foundation models, generative models) to chemical, biological, imaging, and assay datasets.
Build and optimize models for real‑world R&D use cases, ensuring scalability, interpretability, and scientific rigor.

Data Engineering & Pipeline Development

Design, build, and maintain robust data pipelines that curate, standardize, and integrate diverse R&D datasets (chemical, biological, multi‑omics, imaging, biophysical, automation logs, etc.).
Partner with platform teams to implement best‑practice MLOps/DevOps workflows and deploy ML models into production R&D environments.
Develop tooling that accelerates dataset preparation, feature engineering, and model lifecycle management across TD.

Scientific Partnership

Work hand‑in‑hand with TD scientists to understand key biological and chemical questions and shape computational strategy accordingly.
Translate sparse, heterogeneous experimental datasets into insights that guide decision‑making in hit discovery, mechanism studies, perturbation experiments, and compound optimization.
Participate in design, interpretation, and iterative refinement of discovery experiments.

Innovation & Collaboration

Partner with cross-functional teams in R&D Data Science, IT, platform engineering, and therapeutic area groups to drive AI/ML adoption.
Contribute to evaluating new analytical methods, automation technologies, and data platforms supporting next‑generation discovery science.
Champion high standards for data quality, documentation, governance, and reproducibility.

Qualifications

Required

Master’s or Ph.D. in Computational Biology, Bioinformatics, Data Science, Chemistry, Chemical Biology, Biomedical Engineering, Computer Science, or related field.
Experience applying ML/AI in scientific domains (drug discovery, biology, chemistry, systems biology, imaging, or related areas).
Strong programming skills in Python (preferred) and experience with scientific/ML libraries (PyTorch, TensorFlow, scikit‑learn, RDKit, etc.).
Practical experience with data engineering, including data modeling, workflow orchestration, ETL/ELT pipelines, and cloud computing environments (AWS, GCP, or Azure).
Ability to work directly with experimental scientists to solve real R&D challenges.

Preferred

Experience in pharma or biotech discovery, including target assessment, phenotypic screening, medicinal chemistry workflows, and lab automation.
Familiarity with omics, high‑content imaging, chemical structure data, or biological assay data.
Knowledge of data standards (e.g., FAIR, ontologies, controlled vocabularies) and working within regulated or quality‑governed environments.
Strong communication skills and ability to thrive in a matrixed, multidisciplinary environment.

Why This Role Is Unique

This is a rare opportunity to grow in one of the world’s most ambitious and fastest-growing Pharma R&D Data Science organizations, shaping how TD data powers next‑generation therapies in the largest biomedical company on the planet. Your work will directly accelerate Johnson & Johnson’s scientific discovery, fuel AI innovation, and impact patients globally.

Johnson & Johnson is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, age, national origin, disability, protected veteran status or other characteristics protected by federal, state or local law. We actively seek qualified candidates who are protected veterans and individuals with disabilities as defined under VEVRAA and Section 503 of the Rehabilitation Act.

Johnson & Johnson is committed to providing an interview process that is inclusive of our applicants’ needs. If you are an individual with a disability and would like to request an accommodation, external applicants should contact us via _https://www.jnj.com/contact-us/careers_; internal employees should contact AskGS to be directed to the appropriate accommodation resource.

#JRDDS

#JNJDataScience

JNJIMRND-DS

Required Skills:

Preferred Skills:

Advanced Analytics, Coaching, Critical Thinking, Data Analysis, Data Privacy Standards, Data Quality, Data Reporting, Data Savvy, Data Science, Data Visualization, Digital Fluency, Econometric Models, Organizing, Process Improvements, Strategic Thinking, Technical Credibility, Workflow Analysis

The anticipated base pay range for this position is:

$117,000.00 - $201,250.00

Additional Description for Pay Transparency:

Subject to the terms of their respective policies and date of hire, employees are eligible for the following time off benefits:

Vacation – 120 hours per calendar year

Sick time – 40 hours per calendar year; for employees who reside in the State of Colorado – 48 hours per calendar year; for employees who reside in the State of Washington – 56 hours per calendar year

Holiday pay, including Floating Holidays – 13 days per calendar year

Work, Personal and Family Time - up to 40 hours per calendar year

Parental Leave – 480 hours within one year of the birth/adoption/foster care of a child

Bereavement Leave – 240 hours for an immediate family member; 40 hours for an extended family member per calendar year

Caregiver Leave – 80 hours (10 days) in a 52-week rolling period

Volunteer Leave – 32 hours per calendar year

Military Spouse Time-Off – 80 hours per calendar year

For additional general information on Company benefits, please go to: https://www.careers.jnj.com/employee-benefits