What you'd actually do

Architect and build core data platform capabilities that enable enterprise AI, machine learning, and Generative AI workloads, delivering scalable, governed solutions with strong performance and cost efficiency.

Build and operate scalable pipelines for structured and unstructured data ingestion for AI training and inference workloads (batch/stream), implementing data quality checks, lineage capture, and clear SLAs.

Design, implement, and harden reusable data services and APIs that support large language models (LLMs), knowledge retrieval systems, and AI-powered applications, meeting reliability and latency targets.

Design, build, and productionize Retrieval-Augmented Generation (RAG) pipelines, enabling GenAI models to access trusted enterprise data with strong latency, reliability, and evaluation coverage.

Build integrations that connect enterprise knowledge across data lakes, document stores, APIs, and enterprise systems, enabling secure retrieval and reuse in AI applications while reducing duplicated point solutions.

Skills

Required

Data engineering
AI platform engineering
Software engineering
Data architecture
Machine learning pipelines
Generative AI
Agentic AI
LLM integration
RAG implementation
Vector databases
Knowledge graphs
Semantic search
LLMOps
MLOps
AI Observability
Data governance
CI/CD
Infrastructure-as-code

Nice to have

Experience with specific LLM frameworks
Experience with cloud platforms (AWS, Azure, GCP)
Experience with real-time data processing

What the JD emphasized

enterprise technical leadership

scalable, governed data platform capabilities

AI-ready data foundations

Retrieval-Augmented Generation (RAG)

vector search

knowledge graphs

semantic layers

LLM-powered applications

enterprise AI, machine learning, and Generative AI workloads

structured and unstructured data ingestion for AI training and inference workloads

large language models (LLMs)

knowledge retrieval systems

AI-powered applications

LLM abstraction

enterprise knowledge

Retrieval-Augmented Generation (RAG) pipelines

embedding generation, vector indexing, and semantic search capabilities

AI copilots, conversational AI solutions, and agent-based AI workflows

multi-agent orchestration and tool-enabled AI systems

context assembly pipelines

context-collection strategies

context pressure

context rot

data ingestion, transformation, and serving pipelines

data models and data products

CI/CD, automated testing, and infrastructure-as-code

self-verifiable agentic feedback loops

LLMOps and MLOps frameworks

AI systems

data lineage, prompt and model evaluation, monitoring, and performance tracking

enterprise data governance practices

At Johnson & Johnson, we believe health is everything. Our strength in healthcare innovation empowers us to build a world where complex diseases are prevented, treated, and cured, where treatments are smarter and less invasive, and solutions are personal. Through our expertise in Innovative Medicine and MedTech, we are uniquely positioned to innovate across the full spectrum of healthcare solutions today to deliver the breakthroughs of tomorrow, and profoundly impact health for humanity. Learn more at jnj.com.

As guided by Our Credo, Johnson & Johnson is responsible to our employees who work with us throughout the world. We provide an inclusive work environment where each person is considered as an individual. At Johnson & Johnson, we respect the diversity and dignity of our employees and recognize their merit.

**Job Function:**

Data Analytics & Computational Sciences

**Job Sub Function:**

Data Engineering

Job Category:

Scientific/Technology

All Job Posting Locations:

Beerse, Antwerp, Belgium; Limerick, Ireland

Job Description:

We are currently recruiting for a **Principal Data Engineer – AI & Generative AI Platforms** based in **Limerick - Ireland** or Beerse - Belgium.

Role Overview

The Principal Data Engineer – AI & Generative AI Platforms provides enterprise technical leadership for the design and evolution of scalable, governed data platform capabilities that power advanced analytics, machine learning, and next-generation AI solutions (including Generative AI and Agentic AI). This role sets technical direction, establishes reference architectures and engineering standards, and drives measurable outcomes such as improved time-to-delivery, reduced platform/unit costs, and increased reuse and adoption across teams.

This role operates at the intersection of data engineering, AI platform engineering, and software engineering, driving cross-team alignment on how trusted, governed, and high-performance data ecosystems are designed and operated to support enterprise-scale AI workloads.

The successful candidate will play a key role in enabling AI-ready data foundations, supporting capabilities such as Retrieval-Augmented Generation (RAG), vector search, knowledge graphs, semantic layers, and LLM-powered applications. They will provide clear guidance on the benefits and trade-offs of these architectural components—and when to apply them—balancing cost, risk, performance, and governance requirements.

Key Responsibilities

AI & GenAI Data Platform Engineering

Architect and build core data platform capabilities that enable enterprise AI, machine learning, and Generative AI workloads, delivering scalable, governed solutions with strong performance and cost efficiency.
Build and operate scalable pipelines for structured and unstructured data ingestion for AI training and inference workloads (batch/stream), implementing data quality checks, lineage capture, and clear SLAs.
Design, implement, and harden reusable data services and APIs that support large language models (LLMs), knowledge retrieval systems, and AI-powered applications, meeting reliability and latency targets.
Implement LLM abstraction and model-routing components to maintain underlying model flexibility (“right model for the job”), including evaluation gates, fallback strategies, and operational controls.
Build integrations that connect enterprise knowledge across data lakes, document stores, APIs, and enterprise systems, enabling secure retrieval and reuse in AI applications while reducing duplicated point solutions.

Generative AI & Agentic AI Enablement

Design, build, and productionize Retrieval-Augmented Generation (RAG) pipelines, enabling GenAI models to access trusted enterprise data with strong latency, reliability, and evaluation coverage.
Build and run pipelines for embedding generation, vector indexing, and semantic search capabilities, including chunking strategies, refresh schedules, and cost-aware scaling.
Partner hands-on with product and engineering teams to ship AI copilots, conversational AI solutions, and agent-based AI workflows, integrating approved data access patterns and tooling.
Engineer infrastructure that supports multi-agent orchestration and tool-enabled AI systems. Design and implement robust context assembly pipelines (retrieval, ranking, summarization, caching) to balance quality, latency, and cost.
Design and implement streamlined context-collection strategies with telemetry and evaluation loops that maximize performance and accuracy and minimize expense.
Engineer systems that are resilient to common issues such as context pressure and context rot, using refresh/invalidation, drift detection, and re-embedding strategies.

Data Pipeline & Platform Development

Build and operate scalable data ingestion, transformation, and serving pipelines supporting analytics and AI workloads with data quality embedded.
Develop robust data models and data products enabling self-service analytics and AI development.
Implement real-time and batch pipelines supporting operational intelligence and AI-driven applications.
Apply modern engineering practices including CI/CD, automated testing, and infrastructure-as-code.
Apply self-verifiable agentic feedback loops that bring non-deterministic outputs as close to compliance as possible before human-in-the-loop review.

LLMOps, MLOps & AI Observability

Build and operationalize AI systems using LLMOps and MLOps frameworks, including deployment automation, versioning, and repeatable release processes.
Implement end-to-end observability for AI systems including data lineage, prompt and model evaluation, monitoring, and performance tracking.

Data Governance, Trust & Responsible AI

Implement enterprise data governance practices including metadata management, lineage, and data cataloging.
Ensure data platforms support security, privacy, and regulatory requirements.
Contribute to responsible AI practices by ensuring traceability, transparency, and auditability of data used in AI systems.

Collaboration & Delivery

Lead hands-on delivery with Data Scientists, AI Engineers, Software Engineers, and Product Teams to ship reliable data products and AI-enabled capabilities end-to-end.
Translate priority business use cases into well-scoped technical designs, drive engineering alignment through design/architecture reviews, and mentor engineers on implementation patterns and operational rigor.

Required Qualifications

Education

Bachelor’s or Master’s degree in:

Computer Science
Data Engineering
Software Engineering
Artificial Intelligence
or a related technical discipline

Experience

Strong experience in data engineering and modern cloud data platforms.
Expertise in Python, SQL, or Scala.
Experience developing scalable data pipelines and data platforms.
Experience supporting machine learning or AI development workflows.
Strong understanding of data modelling, distributed data processing, and data architecture.

AI / GenAI Experience

Experience with one or more of the following:

Retrieval-Augmented Generation (RAG) pipelines
Vector databases and semantic search
Embedding generation workflows
LLM integration patterns
AI application architectures
AI platform engineering

Preferred Technical Skills

Data Platforms

Snowflake
Databricks
Microsoft Fabric
BigQuery

Data Engineering

Apache Spark
dbt
Airflow / workflow orchestration
Delta / Iceberg / Parquet data formats

AI & GenAI Technologies

vector databases
LLM orchestration frameworks
AI application frameworks
LLM evaluation and observability tools

Cloud Platforms

AWS
Microsoft Azure
Google Cloud

Key Competencies

Strong engineering mindset and problem-solving ability
Ability to design scalable, resilient data systems
Curiosity and passion for AI innovation
Strong collaboration and communication skills

Impact of the Role

This role enables the enterprise to transform data into AI-ready assets, empowering advanced analytics, intelligent automation, and Generative AI solutions that accelerate innovation and deliver measurable business value.

**Required Skills:**

Preferred Skills:

Advanced Analytics, Agility Jumps, Coaching, Critical Thinking, Data Engineering, Data Governance, Data Modeling, Data Privacy Standards, Data Science, Digital Fluency, Execution Focus, Hybrid Clouds, Organizing, Presentation Design, Technical Development, Technical Writing, Technologically Savvy

The anticipated pay range for this position, in the primary posting location, is:

€70.100,00 - €121.210,00

The anticipated pay ranges for additional locations are:

The anticipated base pay range for this position in BELGIUM is EUR 79.800 to EUR 137.770

Benefits:

In addition to base pay, we offer the following benefits*: an annual bonus with a set target (% of pay) depending on pay grade / location, where the actual amount is based on the employee's and the company's performance in the previous calendar year, or sales commissions. We also offer vacation days, parental leave for a minimum of 12 weeks, bereavement leave, caregiver leave, volunteer leave, well-being reimbursement, and programs for financial, physical, and mental health, as well as service anniversary and recognition awards. Subject to the terms of their respective plans, employees, and in some locations eligible dependents, can participate in several insurance plans. For more information, visit Employee benefits | Supporting well-being & career growth | Johnson & Johnson Careers.

*This is for informative purposes only. Amounts and actual benefits may vary by location and are subject to change.