Incident Management Engineer

Palantir · Enterprise · New York, NY · Product Support

This role focuses on ensuring the stability and reliability of Palantir's products by managing critical incidents. The Incident Management Engineer will be responsible for triaging, troubleshooting, and coordinating the resolution of issues, as well as developing tooling and processes to reduce operational overhead and prevent future incidents. The role involves a 24/7 on-call rotation and requires strong communication and problem-solving skills.

What you'd actually do

  1. Develop a deep understanding of Palantir’s product and delivery ecosystem.
  2. Collaborate with customer-facing, product, and infrastructure teams on the development and deployment of scalable, reliable software for our customers.
  3. Diagnose, resolve, and prevent issues encountered in the field.
  4. Reduce the operational overhead of responding to critical incidents at Palantir through investments in tooling, process, and automation.
  5. Take part in a 24/7 on-call rotation responsible for coordinating Palantir’s response to mission-critical incidents, ensuring efficient resolution with minimal customer impact.

Skills

Required

  • Background in Computer Science, Engineering, Information Systems, Incident Management, or another technical field.
  • Excellent problem-solving skills.
  • Comfort working in a fast-paced environment.
  • Ability to work independently and make decisions with minimal direction, as well as to collaborate as part of a team.

Nice to have

  • scripting
  • automation
  • data analysis

What the JD emphasized

  • critical issues immediately
  • most critical outages
  • fast-paced and high-stakes environments
  • critical incidents