Senior Engineer, Compute Services (Kubernetes, Bare Metal)

Weights & Biases · Data AI · Bellevue, WA +2 · Technology

This Senior Engineer, Compute Services role at CoreWeave focuses on building and maintaining fault-tolerant Kubernetes infrastructure on bare metal, using Python, Golang, and Bash. Responsibilities include provisioning, lifecycle management, optimization, and automated testing of Kubernetes control planes. The role requires strong DevOps and Linux troubleshooting skills, along with experience in Ansible and CI/CD tools. Participation in an on-call rotation is expected.

What you'd actually do

  1. Design, develop, and maintain automated tooling to provision Kubernetes control planes on bare metal
  2. Use Python, Golang, and Bash to create tooling and Go operators
  3. Perform Day 2 lifecycle tasks and maintenance on running clusters
  4. Identify gaps and implement fault-tolerant architectures
  5. Optimize reliability using the Grafana ecosystem

Skills

Required

  • Kubernetes provisioning (kubeadm, Cluster API, Kubeception, Kubespray, or similar)
  • Debugging complex Kubernetes cluster issues
  • Golang
  • Bash
  • Python
  • Advanced Linux OS troubleshooting
  • Ansible
  • Advanced DevOps experience (GitLab CI, GitHub Actions)
  • Collaboration on shared codebases
  • Documentation
  • Analytical and problem-solving abilities
  • On-call rotation experience

Nice to have

  • Bare-metal OS provisioning experience
  • Kubernetes operator coding experience
  • Advanced Linux networking expertise
  • AWX/Ansible Tower knowledge