AI Infrastructure Engineer

Intercom · Enterprise · Berlin, Germany +2 · AI Group

The AI Infrastructure Engineer will build and scale the systems for training and serving Intercom's AI products, focusing on model training pipelines and inference services. This role involves GPU-level performance optimization and collaboration with ML scientists to bring cutting-edge methods to production.

What you'd actually do

  1. Implement and scale training pipelines for large transformer models and LLMs, from data ingestion and preprocessing through distributed training and evaluation.
  2. Build and optimize inference services that deliver low‑latency, high‑reliability experiences for our customers, including autoscaling, routing, and fallbacks.
  3. Work on GPU‑level performance: tuning kernels, improving utilization, and identifying bottlenecks across our training and inference stack.
  4. Collaborate closely with ML scientists to implement cutting-edge training and inference methods and bring them to production.
  5. Play an active role in hiring, mentoring, and developing other engineers on the team.

Skills

Required

  • 5+ years of experience in software engineering
  • strong track record of shipping high-quality products or platforms
  • degree in Computer Science, Computer Engineering, or a related field (or equivalent experience)
  • experience working in production environments at meaningful scale
  • deep knowledge of at least one programming language (e.g. Python, Ruby, Java, Go)
  • ability to write clean, reliable code and learn new stacks quickly

Nice to have

  • Experience at AI-native companies that train and/or run inference for their own models
  • Experience running training or inference workloads on Kubernetes
  • Experience with AWS or other major cloud providers
  • Production experience with Python in ML or infrastructure contexts
  • Demonstrated passion for technology

What the JD emphasized

  • model training
  • model inference at scale
  • GPU-level performance

Other signals

  • building systems that train and serve AI products
  • model training at scale
  • GPU-level performance tuning
  • implementing and scaling training pipelines
  • optimizing inference services