Software Engineer, Infrastructure

Sierra · AI Frontier · San Francisco, CA · Engineering

This role focuses on designing, building, and maintaining core systems for an AI platform. It emphasizes LLM inference serving, cloud infrastructure (AWS, GCP, Azure) managed with Terraform, CI/CD, distributed systems, and observability tooling, all in service of reliability, scalability, and performance.

What you'd actually do

  1. Ensure the reliability, scalability, and performance of our platform and LLM inference serving as traffic grows rapidly.
  2. Build and maintain cloud infrastructure using Terraform to ensure scalable, secure, and reproducible environments.
  3. Create and maintain a self-serve infrastructure platform that enables the rest of engineering to deploy and operate services.
  4. Own and evolve CI/CD pipelines and release management, enabling fast, reliable deployments for Sierra’s platform.
  5. Architect and operate distributed systems that leverage distributed databases, retrieval systems, and ML models.

Skills

Required

  • 5-7+ years of hands-on development experience
  • building automation, tooling, and platforms
  • designing maintainable systems
  • cloud platforms (AWS, GCP, or Azure)
  • infrastructure as code (Terraform preferred)
  • CI/CD systems
  • release management
  • containers and container orchestration (e.g., Docker, Kubernetes)
  • observability tools (Prometheus, Grafana, Datadog, OpenTelemetry, etc.)
  • incident response
  • operating distributed systems in production
  • Degree in Computer Science or related field, or equivalent professional experience

Nice to have

  • Production experience working with LLMs and machine learning models
  • Background in distributed systems
  • running SaaS services at scale
  • agentic architecture
  • security and authentication protocols (OAuth, SSO, mTLS)
  • fast-paced startup environment
  • platform/infra-focused team

What the JD emphasized

  • LLM inference serving
  • cloud infrastructure
  • Terraform
  • CI/CD
  • distributed systems
  • observability tooling

Other signals

  • self-serve infrastructure platform
  • CI/CD pipelines