Senior Software Engineer I, Inference

Weights & Biases · Data AI · Bellevue, WA +1 · Technology

CoreWeave is seeking a Senior Software Engineer to own and improve its Kubernetes-native inference platform, with a focus on latency, throughput, and reliability. The role involves leading design, implementing optimizations, strengthening incident posture, and mentoring junior engineers. It requires experience with distributed systems, Kubernetes, and inference internals.

What you'd actually do

  1. Lead design reviews and drive architecture within the team; decompose multi-service work into clear milestones.
  2. Define and own SLIs/SLOs; ensure post-incident actions land and reliability improves release-over-release.
  3. Implement advanced optimizations (e.g., micro-batch schedulers, speculative decoding, KV-cache reuse) and quantify impact.
  4. Strengthen incident posture: capacity planning, autoscaling policy, graceful degradation, rollback/traffic-shift strategies.
  5. Mentor IC1/IC2 engineers; review cross-team designs and elevate coding/testing standards.

Skills

Required

  • distributed systems
  • cloud services
  • Python
  • Go
  • networked systems
  • performance
  • Kubernetes
  • CI/CD
  • observability stacks
  • Prometheus
  • Grafana
  • OpenTelemetry
  • inference internals
  • batching
  • caching
  • mixed precision
  • streaming token delivery
  • metrics-driven work

Nice to have

  • C++
  • inference frameworks (vLLM, Triton, TensorRT-LLM, Ray Serve, TorchServe)
  • CUDA kernels
  • NCCL/SHARP
  • RDMA/NUMA
  • GPU interconnect topologies
  • multi-team initiatives
  • customer partnership

What the JD emphasized

  • P99 SLAs
  • tail latency (P95/P99)
  • service reliability
  • Kubernetes at production scale

Other signals

  • inference platform
  • latency
  • throughput
  • reliability
  • Kubernetes-native