Principal Software Engineer, Site Reliability

UiPath · Enterprise · Bucharest, Romania · Engineering

This role is for a Principal Software Engineer on the Site Reliability team, focused on building and scaling AI-powered platforms and systems that uphold compliance and SLA commitments to customers. The work involves designing, engineering, and shipping these systems as products, driving their adoption, and improving reliability, scalability, and performance. It requires a strong software engineering background with experience in distributed systems, internal platforms, and AI-powered applications.

What you'd actually do

  1. Design, engineer, and build SRE platform systems and capabilities with cutting-edge AI, treating them as products that other engineering teams depend on in their critical path.
  2. Participate in livesite monitoring rotations, handle escalations, and drive effective mitigations - reducing customer impact through broad, detailed, and effective post-mortems.
  3. Drive availability, scalability, and performance improvements based on livesite learnings. Generate (or codify existing) best practices and ensure they are followed widely across UiPath - not by publishing guidance, but by embedding them into the systems you build.
  4. Ensure technical deliverables meet or exceed expectations on reliability, scalability, quality, and performance. Identify and drive architectural changes that significantly move the needle on these dimensions.
  5. Onboard other teams onto your platforms by driving outcomes yourself - writing the integrations, pairing with their engineers, removing friction - rather than handing off documentation and waiting.

Skills

Required

  • Experience architecting and engineering world-class, large-scale, distributed commercial applications and services
  • Experience building large-scale, complex internal platforms adopted by 10+ teams in their critical path
  • Demonstrated ability to drive adoption of your systems by doing the hard work yourself
  • Experience building and maintaining complex AI-powered applications in production
  • Proficiency in one or more object-oriented languages (such as C#, C++, Java, or Python)
  • Solid computer science fundamentals
  • Deep understanding of data structures, algorithms, multithreading, synchronization, asynchronous patterns, and cloud programming
  • Experience with service-oriented and microservice-based architectures, HTTP applications, and web services development
  • Familiarity with modern engineering practices including agile development, CI/CD, and DevOps
  • Ability to work with globally distributed teams

Nice to have

  • Experience working with or managing production Kubernetes infrastructure
  • Experience with cloud providers (Azure, AWS, GCP) and managed services (AKS, GKE, etc.)
  • Experience with database backends (e.g., Azure SQL, CosmosDB, Azure Data Lake, Power BI, MongoDB, MySQL, DynamoDB)

What the JD emphasized

  • Proven track record (10+ years) of architecting and engineering large-scale, distributed commercial applications and services
  • Experience building large-scale, complex internal platforms adopted by 10+ teams in their critical path at a large company
  • Demonstrated ability to drive adoption of your systems by doing the hard work yourself
  • Experience building and maintaining complex AI-powered applications in production

Other signals

  • AI-powered applications in production
  • SRE platform systems and capabilities with cutting-edge AI
  • Drive availability, scalability, and performance improvements