Data Scientist - Clinical Decision Support - LLMs

Johnson & Johnson · Pharma · Danvers, MA +1

Seeking an experienced Data Scientist to develop next-generation clinical decision support solutions using LLMs and machine learning. Responsibilities include prompt engineering, evaluation frameworks, cloud infrastructure for LLMs, RAG optimization, and developing ML models for clinical data. The role operates within a regulated SaMD environment and requires collaboration with cross-functional teams.

What you'd actually do

  1. Develop advanced prompt engineering strategies and implement evaluation frameworks to optimize accuracy, reduce hallucinations, and strengthen safety guardrails.
  2. Support design and development of secure, scalable cloud infrastructure (IaaS/PaaS) for hosting LLMs, including Azure ML Service, Azure Kubernetes Service (AKS), and Container Apps
  3. Configure and optimize data indexing for Retrieval-Augmented Generation (RAG) techniques using Azure AI Search or vector databases
  4. Support and maintain documentation for AI models, ensuring validation, transparency, explainability, and traceability for regulatory submissions.
  5. Develop and implement machine learning models for clinical decision support, translating time-series physiologic signals and clinical data into robust, actionable insights for patient management

Skills

Required

  • Master’s degree with 2+ years of experience or PhD in a relevant field.
  • Hands-on experience with LLMs/GenAI within the cloud ecosystem (Azure preferred)
  • Proficiency in Python for scripting and integrating LLM/agentic frameworks (e.g., LangChain)
  • Understanding of RAG architecture, vector databases, embedding models, and search technologies
  • Experience with SQL databases and APIs for ETL operations
  • Experience with machine learning model development and frameworks (PyTorch, TensorFlow, Scikit-learn)
  • Background in signal processing and biostatistics to analyze physiologic data and apply rigorous methods to model development and validation.
  • Working knowledge of biostatistics

Nice to have

  • Knowledge of cardiovascular physiology or critical care/ICU workflows
  • Experience with clinical datasets and medical device data
  • Understanding of Software as a Medical Device (SaMD) development lifecycle and FDA regulatory expectations
  • Experience deploying models into production systems
  • Familiarity with multimodal data types such as medical images and sensor-based data

What the JD emphasized

  • regulated Software as a Medical Device (SaMD) environment
  • regulatory submissions
  • FDA regulatory expectations

Other signals

  • LLM-driven extraction of insights from unstructured EMR sources
  • deployment of AI-enabled solutions in regulated Software as a Medical Device (SaMD) environment
  • Develop advanced prompt engineering strategies and implement evaluation frameworks to optimize accuracy, reduce hallucinations, and strengthen safety guardrails
  • Support design and development of secure, scalable cloud infrastructure (IaaS/PaaS) for hosting LLMs
  • Configure and optimize data indexing for Retrieval-Augmented Generation (RAG) techniques using Azure AI Search or vector databases
  • Develop and implement machine learning models for clinical decision support, translating time-series physiologic signals and clinical data into robust, actionable insights for patient management