Informatics and AI Engineer, Biotherapeutics

Pfizer · Pharma · MA

Pfizer is seeking an Informatics and AI Engineer to build and scale an AI-ready data architecture for biotherapeutics labs. The role involves designing and implementing agentic AI solutions for data structuring, capture, and insight extraction from proprietary and external datasets, supporting drug discovery. Responsibilities include developing data platforms, ML methods, analysis pipelines, and data products, with a focus on LLMs, agentic AI, and RAG systems.

What you'd actually do

  1. Develop, support and implement a modern data platform to enable efficient and scalable correlation and analysis of data for biologic drug modalities.
  2. Develop innovative data products and machine learning methods for biologics data together with machine learning experts within Pfizer
  3. Curate and integrate relevant datasets from the public domain
  4. Develop analysis pipelines with appropriate use of agentic AI
  5. Develop and deploy data products to meet specific needs through data integration

Skills

Required

  • PhD in Biology, Chemistry, Physics, Statistics or a related technical discipline OR Master’s degree and 2+ years of experience building AI-powered research applications
  • Strong background in data handling, integration and analysis
  • Thorough understanding of drug discovery and biology with a particular focus on large molecule therapeutics such as peptides and antibodies.
  • Research experience developing data products and data integration solutions as well as a sincere desire to innovate at the nexus of data science and agentic AI for life sciences
  • Experience solving complex analyses/problems in a timely fashion
  • Experience with LLMs, agentic AI, and MCP/RAG systems
  • Exceptional programming skills in Python
  • Experience with data governance rules and data validation
  • Strong experience as a full-stack developer with a focus on Python and SQL, in-depth database expertise and understanding of ETL frameworks and data warehouse technologies.
  • Strong communication skills—verbal, written and presentation; demonstrated ability to communicate at the appropriate technical level with scientists having diverse areas of expertise

Nice to have

  • Proficiency in front-end technologies and browser-based visualization techniques
  • Expertise in software engineering, package development, cloud architectures, CI/CD, and software engineering tooling
  • Nextflow pipeline development
  • Strong knowledge of Linux systems including containerization technologies and HPC environments
  • Familiarity with pertinent libraries within the Python scientific stack
  • Hands-on experience handling, processing, integrating, and analyzing large heterogeneous data sets in a drug discovery research environment
  • Experience with Claude Code or equivalent and agentic coding paradigms
  • Demonstrated ability to communicate complex technical work through publications, presentations, or detailed documentation
  • Experience taking ideas from prototype to production.

What the JD emphasized

  • agentic AI
  • LLMs
  • MCP/RAG systems
  • agentic AI for life sciences
  • agentic coding paradigms

Other signals

  • design innovative software and agentic AI solutions
  • extract valuable insights from both proprietary and external datasets
  • develop analysis pipelines with appropriate use of agentic AI
  • Experience with LLMs, agentic AI, and MCP/RAG systems