Infrastructure and Platform Engineer, Metal

Tenstorrent · Semiconductors · Santa Clara, CA · AI Software

This role focuses on designing and operating Kubernetes-based platforms in on-prem data centers so that engineers and customers can run workloads efficiently on Tenstorrent hardware. The team builds the platforms that power internal development, workload orchestration, and hardware allocation across large-scale AI systems.

What you'd actually do

  1. Design and build platform services for workload orchestration, ML services, and internal development workflows.
  2. Develop APIs and systems that enable users and services to interact with infrastructure platforms.
  3. Own Kubernetes-based platforms, including cluster lifecycle, scaling, and operational maturity.
  4. Integrate platform systems with CI/CD pipelines, GitOps workflows, and internal tooling.
  5. Partner with SRE, infrastructure, and deployment teams to support large-scale internal and external environments.

Skills

Required

  • Kubernetes
  • Python
  • Go
  • Linux
  • networking fundamentals
  • distributed systems

What the JD emphasized

  • large-scale AI systems
  • custom accelerator hardware
  • large-scale internal and external environments