Senior Engineer, Infrastructure Platform

Intercom · Enterprise · Dublin, Ireland +1 · R&D

Intercom is seeking a Senior Engineer for its Infrastructure Platform team. The role focuses on building and operating high-scale distributed systems and platforms in a cloud environment (AWS), giving product teams reliability, observability, secure deployment, and velocity. The engineer will treat infrastructure challenges as software problems to automate, drive architectural evolution, own service reliability, and collaborate across teams to enable product engineers.

What you'd actually do

  1. Automate everything: Eradicate manual, repetitive, and unscalable work. You will view infrastructure challenges as software problems, building automation and tooling that accelerates R&D.
  2. Drive architectural evolution: Plan, design, and execute generational changes to Intercom’s core infrastructure. You will navigate complex scaling challenges and build elasticity into our systems.
  3. Deliver operational excellence: Own the reliability of our core services. You will respond to operational events, dig deep to understand why they happened, and engineer permanent, preventative solutions.
  4. Be full stack: When there is an issue, you don’t stop at the infrastructure layer. You dive deep into our application code to trace problems out to the edge and fix customer-facing issues.
  5. Force-multiply through enablement: Abstract away shared concerns like security, compute, availability, and costs, so product engineers don't have to think about them. You'll scale your expertise through exceptional, easy-to-use agent skills, documentation, and reliable defaults.
  6. Raise the bar: Hold yourself and your teammates accountable with deep empathy. Share your domain expertise, review code quickly, unblock your peers, and foster a culture of continuous learning and support.
  7. Ship production-grade systems: Independently lead complex initiatives, owning the design, implementation, testing, and operational runbooks for high-impact infrastructure projects.

Skills

Required

  • Cloud infrastructure expertise (AWS preferred)
  • Designing, building, and operating high-scale distributed systems and platforms
  • Accountability for availability, performance, and costs
  • Modern programming languages
  • Expertise in reliability, performance, and failure modes
  • Use of telemetry and metrics
  • Developer empathy and understanding of product developer pain points
  • Proven track record of independently driving complex, cross-team technical initiatives
  • Clear technical documentation and communication skills
  • Experience with blameless incident reviews
  • Ability to present trade-offs clearly

Nice to have

  • Infrastructure-as-code

What the JD emphasized

  • high-scale distributed systems
  • reliability
  • observability
  • secure deployment
  • velocity
  • automation
  • architectural evolution
  • scaling challenges
  • operational events
  • preventative solutions
  • application code
  • customer-facing issues
  • security
  • compute
  • availability
  • costs
  • exceptional, easy-to-use agent skills
  • documentation
  • reliable defaults
  • deep empathy
  • domain expertise
  • continuous learning
  • support
  • production-grade systems
  • complex initiatives
  • design
  • implementation
  • testing
  • operational runbooks
  • high-impact infrastructure projects
  • Cloud infrastructure expertise
  • platforms
  • cloud environment
  • AWS
  • performance
  • Engineering rigor
  • modern programming languages
  • infrastructure-as-code
  • infrastructure as a software engineering discipline
  • Operational obsession
  • failure modes
  • metrics
  • outliers
  • telemetry
  • uptime
  • resilience
  • Developer empathy
  • internal customers
  • product developers
  • friction
  • Execution and ownership
  • independently driving complex, cross-team technical initiatives
  • conception to production
  • multiple quarters
  • Clear, timely communication
  • crisp technical docs
  • blameless incident reviews
  • trade-offs
  • asynchronous
  • fast moving
  • highly collaborative environment
  • company-wide changes
  • best work of your career
  • collaborative problem-solving
  • massive scaling challenges