A World-Changing Company

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.

The Role

Product Reliability Engineers (PREs) are responsible for the health, performance, and stability of the services that power Palantir's platforms. PREs take ownership of the entire end-to-end cycle of service reliability, from responding to outages to improving codebases and building lasting solutions.

You will tackle critical issues for key customers, introduce observability into complex systems, address tech debt in essential codebases, and inform strategic investments in core products. We are looking for engineers who enjoy deep-dive troubleshooting, feel strong ownership over the problems they encounter, and recognize the urgency of customer-facing outages.

PREs spend the majority of their time on forward-looking product work, including, but not limited to, infrastructure migrations, product contributions to improve stability and observability, and codebase enhancements that increase resilience. During periodic on-call shifts, we respond to automated alerts, investigate issues reported by customers, and share technical expertise with adjacent product teams.

Whatever the technical issue or question about your service is, you'll play a central and critical role in resolving it, seeking not just a one-time fix, but a permanent solution. We provide new team members with an experienced mentor and a clear onboarding framework to set them up for success in the role.

Core Responsibilities

Continuously invest in documentation, metrics, monitors, and other troubleshooting tools
Participate in on-call rotations during business hours and occasional weekends. This is a challenging yet rewarding opportunity to help remediate the most pressing issues across the Palantir fleet.
Diagnose, resolve, and prevent issues encountered in the field, and deliver end-to-end improvements to core products based on those issues
Improve observability by refactoring codepaths and introducing telemetry
Identify and implement data-driven opportunities for improved service resilience
Develop strategic opinions on stability investments and inform the vision for long-term product stability

What We Value

Comfort with and curiosity about large-scale production systems and technologies, such as load balancing, monitoring, distributed systems, and configuration management
Confidence in troubleshooting complex issues independently using observability tools and stack traces
Familiarity with health checks and with monitoring tools such as Prometheus
Experience coding with Java, Go, and/or web technologies (e.g. HTML, CSS, JavaScript, Python/Ruby, Django/Flask/Ruby on Rails) is a plus
Track record of identifying bugs in codebases and contributing fixes that lead to long-term service stability
Demonstrated ability to make data-driven decisions and engage with stakeholders on strategy

What We Require

Engineering background in Computer Science, Mathematics, Software Engineering, Physics, or a similar field
Ability to work with a high degree of ownership and a strong sense of urgency in a dynamic environment
Experience producing code in backend languages such as Java, as part of a past role or personal projects
Familiarity with storage and data processing systems and cloud infrastructure
Strong written and verbal communication and ability to iterate quickly with teammates and incorporate feedback
Eligibility and willingness to obtain a US security clearance

Salary

The estimated salary range for this position is $96,000 - $140,000/year. Total compensation for this position may also include Restricted Stock Units, a sign-on bonus, and other potential future incentives. Further note that total compensation for this position will be determined by each individual's relevant qualifications, work experience, skills, and other factors. This estimate excludes the value of any potential sign-on bonus; the value of any benefits offered; and the potential future value of any long-term incentives.

Our benefits aim to promote health and wellbeing across all areas of Palantirians’ lives. We work to continuously improve our offerings and listen to our community as we design and update them. The list below details our available benefits and some of the perks that can be enjoyed as an employee of Palantir Technologies.

Benefits

• Employees (and their eligible dependents) can enroll in medical, dental, and vision insurance as well as voluntary life insurance

• Employees are automatically covered by Palantir’s basic life, AD&D and disability insurance

• Commuter benefits

• Take-what-you-need paid time off, not accrual-based

• 2 weeks paid time off built into the end of each year (subject to team and business needs)

• 10 paid holidays throughout the calendar year

• Supportive leave of absence program including time off for military service and medical events

• Paid leave for new parents and subsidized back-up care for all parents

• Fertility and family-building benefits including but not limited to adoption, surrogacy, and preservation

• Stipend to help with expenses that come with a new child

• Employees can enroll in Palantir’s 401(k) plan

Life at Palantir

We want every Palantirian to achieve their best outcomes. That’s why we celebrate individuals’ strengths, skills, and interests, from your first interview to your long-term growth, rather than rely on traditional career ladders. Paying attention to the needs of our community enables us to optimize our opportunities to grow and helps ensure many pathways to success at Palantir. Promoting health and well-being across all areas of Palantirians’ lives is just one of the ways we’re investing in our community. Learn more at Life at Palantir and note that our offerings may vary by region.

In keeping with Palantir’s values and culture, we believe employees are “better together” and in-person work affords the opportunity for more creative outcomes. Therefore, we encourage employees to work from our offices to foster connectivity and innovation. Many teams do offer hybrid options (WFH a day or two a week), allowing our employees to strike the right trade-off for their personal productivity. Based on business need, there are a few roles that allow for “Remote” work on an exceptional basis. If you are applying for one of these roles, you must work from the state in which you are employed. If the posting is specified as Onsite, you are required to work from an office.

If you want to empower the world's most important institutions, you belong here. Palantir values excellence regardless of background. We are proud to be an Equal Opportunity Employer for all, including but not limited to Veterans and those with disabilities. Palantir is committed to making the application and hiring process accessible to everyone and will provide a reasonable accommodation for those living with a disability. If you need an accommodation for the application or hiring process, please reach out and let us know how we can help.

Please note that you will never be asked to submit a payment or share financial information to participate in our interview process. If you suspect that you've been contacted by a scammer, we recommend you cease all communication with the individual and consider reporting them to the relevant authorities, such as the US FBI Internet Crime Complaint Center (IC3).

If you would like to understand more about how your personal data will be processed by Palantir, please see our Privacy Policy.

Product Reliability Engineer - Defense
