Manager, Data Engineering

Pfizer · Pharma · Mumbai, India

This Manager, Data Engineering role focuses on building the data foundation, pipelines, and feature stores for AI and analytics applications, including Retrieval-Augmented Generation (RAG) and agentic systems, within the pharmaceutical domain. The role involves developing the advanced analytics layer, executing the data strategy, and enabling advanced analytics and machine learning models.

What you'd actually do

  1. Leads the development of core data components for our advanced analytics layer and agentic data layers, enabling next-generation analytics and AI tools.
  2. Designs and builds end-to-end data pipelines and products specifically to power advanced AI and Retrieval-Augmented Generation (RAG) applications for the Commercial Pharma domain.
  3. Builds the clean, reliable data foundation that enables the use of statistical analysis, machine learning, and AI techniques such as RAG to uncover patterns and insights.
  4. Stays abreast of analytical trends and cutting-edge applications of data science and AI, including RAG and agentic systems, actively applying new techniques and tools to improve data pipelines.
  5. Implements and adheres to best practices in data management, model validation, and ethical AI, maintaining high standards of quality and compliance in all developed solutions.

Skills

Required

  • Python
  • Polars
  • Pandas
  • NumPy
  • dbt
  • Airflow
  • Spark
  • Snowflake
  • Snowflake Cortex Agents
  • Data Modeling
  • SQL
  • NoSQL
  • Data Quality
  • Observability
  • Git
  • CI/CD
  • Docker
  • Project Leadership

Nice to have

  • Pharma Analytics
  • Performance Tuning
  • Cost Optimization
  • Large-Scale Data Infrastructure
  • Tableau
  • Power BI
  • Streamlit
  • Business Communication
  • Data Product Management

What the JD emphasized

  • AI and analytics applications
  • AI and machine learning models
  • data foundation
  • advanced AI and machine learning algorithms
  • agentic data layers
  • Retrieval-Augmented Generation (RAG) applications
  • agentic systems

Other signals

  • data foundation for high-impact projects
  • deployment of cutting-edge AI and machine learning algorithms