Staff Software Developer - Enterprise AI

Rubrik · Enterprise · Palo Alto, CA · Information Technology & Services

The Staff Software Developer on the Enterprise AI team builds and operates the platform and backend systems that power AI workflows at Rubrik. The role involves owning core infrastructure, designing distributed services, and integrating with AI tooling.

What you'd actually do

  1. Own end-to-end delivery of major platform initiatives, from design through deployment and post-launch success.
  2. Own Kubernetes in depth: clusters, networking, operators, container lifecycle, and multi-tenant orchestration.
  3. Design, develop, and optimize distributed services and cloud-native infrastructure on AWS and/or GCP for scale, reliability, and performance.
  4. Drive engineering excellence through code quality standards, design reviews, automation, and CI/CD best practices.
  5. Collaborate across teams — Product, AI, and Security — to align architecture with business objectives.

Skills

Required

  • 6+ years of software engineering experience with a deep backend and infrastructure focus
  • Python and/or Go
  • Kubernetes (building and operating clusters)
  • Designing and operating distributed systems in production
  • AWS and/or GCP (compute, storage, IAM, networking, managed services)
  • Infrastructure-as-code (Terraform or similar)
  • CI/CD pipelines
  • Familiarity with applied AI tooling and patterns (agentic AI tools, AI gateways, agent frameworks)
  • System design and architectural judgment
  • Communication and collaboration skills

Nice to have

  • Observability stacks (Prometheus, Grafana, Datadog, OpenTelemetry)
  • Multi-cloud or hybrid infrastructure experience
  • API gateways, AI gateways, and policy/authorization frameworks (ABAC, OPA)
  • Service mesh or platform-as-a-service design experience
  • Improving engineering productivity at scale

What the JD emphasized

  • Strong programming skills in Python and/or Go
  • Deep, hands-on Kubernetes experience
  • Cloud-native fluency across AWS and/or GCP
  • Familiarity with applied AI tooling and patterns — agentic AI tools (Claude, LiteLLM), AI gateways, agent frameworks

Other signals

  • platform engineering
  • distributed systems
  • Kubernetes
  • AWS/GCP
  • AI enablement