AI Data Engineer - LLMs / Agentic Systems

Pfizer · Pharma · MA

This role focuses on building and deploying full-stack applications that integrate LLM and AI capabilities into pharmaceutical research workflows. Responsibilities include developing backend services for data processing, embedding generation, vector search, and LLM orchestration, implementing RAG systems and agentic architectures, and creating frontend interfaces. The role also involves contributing to semantic frameworks and deploying systems on AWS.

What you'd actually do

  1. Design and implement production-grade full-stack applications that integrate LLM and AI capabilities into scientific workflows, so researchers can use AI in their daily work
  2. Collaborate directly with medicinal chemists, biomedical researchers, and domain experts to understand requirements, translate scientific challenges into technical solutions, and deliver intuitive, user-centric applications
  3. Develop scalable backend services using Python frameworks for data processing, embedding generation, vector search, and LLM orchestration that power AI-driven research tools
  4. Build responsive frontend interfaces using React and TypeScript that improve researcher productivity
  5. Implement retrieval-augmented generation (RAG) systems, conversational AI interfaces, and agentic LLM architectures that automate knowledge work in pharmaceutical research

Skills

Required

  • Python
  • TypeScript
  • React
  • FastAPI
  • LLM orchestration
  • RAG
  • conversational AI interfaces
  • agentic LLM architectures
  • vector search
  • embedding generation
  • AWS

Nice to have

  • Life sciences
  • pharmaceutical research
  • drug discovery
  • cheminformatics
  • OpenAI API
  • Hugging Face Transformers
  • Anthropic Claude
  • prompt engineering
  • LLM optimization
  • MongoDB
  • PostgreSQL
  • PyTorch
  • Docker
  • CI/CD
  • GitHub Actions
  • Jenkins
  • GitLab CI

What the JD emphasized

  • production-grade full stack applications
  • production-quality software
  • GitHub portfolio required

Other signals

  • design, develop, and deploy intelligent systems
  • integrate LLM and AI capabilities into scientific workflows
  • development of scalable backend services using Python frameworks for data processing, embedding generation, vector search, and LLM orchestration
  • implementation of retrieval-augmented generation (RAG) systems, conversational AI interfaces, and agentic LLM architectures