Principal Software Engineer, Site Reliability

UiPath · Enterprise · Bellevue, WA · Engineering

UiPath is seeking a Principal Software Engineer for its Site Reliability team. This role focuses on building and scaling SRE platform systems and capabilities, increasingly powered by AI. The engineer will design and build these systems as products, participate in livesite monitoring, drive availability and performance improvements, and ensure technical deliverables meet high standards. A key aspect is driving adoption of these platforms by engaging directly with other teams and embedding best practices into the systems themselves. The role requires a strong background in distributed systems, experience with AI-powered applications in production, and proficiency in object-oriented programming.

What you'd actually do

  1. Design, engineer, and build SRE platform systems and capabilities with cutting-edge AI, treating them as products that other engineering teams depend on in their critical path.
  2. Participate in livesite monitoring rotations, handle escalations, and drive effective mitigations - reducing customer impact through broad, detailed, and effective post-mortems.
  3. Drive availability, scalability, and performance improvements based on livesite learnings. Generate (or codify existing) best practices and ensure they are followed widely across UiPath - not by publishing guidance, but by embedding them into the systems you build.
  4. Ensure technical deliverables meet or exceed expectations on reliability, scalability, quality, and performance. Identify and drive architectural changes that significantly move the needle on these dimensions.
  5. Onboard other teams onto your platforms by driving outcomes yourself - writing the integrations, pairing with their engineers, removing friction - rather than handing off documentation and waiting.

Skills

Required

  • Proven track record (10+ years) of architecting and engineering world-class, large-scale, distributed commercial applications and services, and ensuring customer success.
  • Experience building large-scale, complex internal platforms adopted by 10+ teams in their critical path at a large company.
  • Demonstrated ability to drive adoption of your systems by doing the hard work yourself: writing integrations, removing friction for other teams, and measuring success by outcomes delivered - not features shipped.
  • Experience building and maintaining complex AI-powered applications in production.
  • Proficiency in one or more object-oriented languages (such as C#, C++, Java, or Python), backed by solid computer science fundamentals.
  • Deep understanding of data structures, algorithms, multithreading, synchronization, asynchronous patterns, and cloud programming.
  • Experience with service-oriented and microservice-based architectures, HTTP applications, and web services development.
  • Familiarity with modern engineering practices including agile development, CI/CD, and DevOps.
  • Ability to work with globally distributed teams.

Nice to have

  • Experience working with or managing production Kubernetes infrastructure.
  • Experience with cloud providers (Azure, AWS, GCP) and managed services (AKS, GKE, etc.).
  • Experience with database backends (e.g., Azure SQL, CosmosDB, Azure Data Lake, Power BI, MongoDB, MySQL, DynamoDB).

What the JD emphasized

  • Proven track record (10+ years) of architecting and engineering world-class, large-scale, distributed commercial applications and services, and ensuring customer success.
  • Experience building large-scale, complex internal platforms adopted by 10+ teams in their critical path at a large company — systems that have stood the test of time, not prototypes that were handed off or abandoned.
  • Demonstrated ability to drive adoption of your systems by doing the hard work yourself: writing integrations, removing friction for other teams, and measuring success by outcomes delivered - not features shipped.
  • Experience building and maintaining complex AI-powered applications in production.

Other signals

  • building platforms and systems
  • increasingly powered by AI
  • treat every system you ship as a product
  • drive adoption of your systems