Data Engineer

Mistral AI · AI Frontier · Paris, France · Engineering & Infra

Seeking a Data Engineer with an Analytics Engineering background to build, optimize, and maintain data infrastructure. The role involves working with large datasets to support AI model training, deployment pipelines, and feature stores, as well as enabling business users with data. Requires proficiency in Python, SQL, dbt, and cloud platforms.

What you'd actually do

  1. Design, build, and maintain scalable data pipelines, ETL processes, and analytics infrastructure. Automate data quality checks and validation processes.
  2. Collaborate with cross-functional teams to understand data needs and deliver high-quality, actionable solutions; for example, work closely with machine learning teams to support model training, deployment pipelines, and feature stores.
  3. Optimize data storage, retrieval, processing, and queries for performance, scalability, and cost-efficiency.
  4. Define and enforce data governance, metadata management, and data lineage standards.
  5. Ensure data integrity, security, and compliance with industry standards.

Skills

Required

  • Python
  • SQL
  • dbt
  • cloud platforms (e.g., AWS, GCP, Azure)
  • data warehousing solutions (e.g., Snowflake, BigQuery, Redshift, ClickHouse)

Nice to have

  • machine learning pipelines
  • MLOps
  • feature engineering
  • containerization and orchestration tools (e.g., Docker, Kubernetes)
  • DevOps practices
  • CI/CD pipelines
  • infrastructure-as-code (e.g., Terraform)
  • building self-service data platforms

What the JD emphasized

  • support model training, deployment pipelines, and feature stores
  • data governance, metadata management, and data lineage

Other signals

  • build, optimize, and maintain our data infrastructure
  • work with large volumes of data
  • support our science team in enhancing the quality of our state-of-the-art AI models