Weights & Biases

Scaling

Data AI · ML experiment tracking

Currently tracking 21 active AI roles. New openings are up 26% versus the prior 4 weeks (67 vs. 53). Primary focus: Serve · Engineering. Salary range $92k–$341k (avg $209k).

Hiring: 21 / 21
Momentum (4w): +14 (+26%) · 67 opens last 4w · 53 prior 4w
Salary range: $92k–$341k · avg $209k · USD · disclosed roles only
Tracked since: Aug '24 · last role today
Hiring velocity (new roles per week, oldest first)
Jul 15: 1 new role
Aug 19: 1 new role
Oct 21: 1 new role
Mar 17: 1 new role
Apr 7: 1 new role
Apr 14: 4 new roles
Apr 21: 2 new roles
May 12: 4 new roles
May 19: 2 new roles
Jun 16: 2 new roles
Jun 23: 1 new role
Jun 30: 1 new role
Jul 7: 1 new role
Jul 21: 1 new role
Aug 11: 1 new role
Aug 18: 1 new role
Sep 8: 2 new roles
Sep 15: 1 new role
Sep 22: 2 new roles
Sep 29: 1 new role
Oct 6: 2 new roles
Oct 20: 2 new roles
Oct 27: 1 new role
Nov 3: 1 new role
Nov 10: 6 new roles
Nov 17: 2 new roles
Nov 24: 2 new roles
Dec 1: 2 new roles
Dec 8: 1 new role
Dec 15: 6 new roles
Dec 22: 1 new role
Dec 29: 1 new role
Jan 5: 5 new roles
Jan 12: 7 new roles
Jan 19: 11 new roles
Jan 26: 15 new roles
Feb 2: 14 new roles
Feb 9: 10 new roles
Feb 16: 4 new roles
Feb 23: 7 new roles
Mar 2: 7 new roles
Mar 9: 9 new roles
Mar 16: 12 new roles
Mar 23: 10 new roles
Mar 30: 15 new roles
Apr 6: 16 new roles
Apr 13: 10 new roles
Apr 20: 19 new roles
Apr 27: 14 new roles
May 4: 24 new roles
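The page does not say how the 4-week momentum figure is derived. The sketch below assumes it is the sum of the most recent four weekly counts compared with the sum of the four weeks before that. Under that assumption it reproduces the +14 / +26% shown above. The function name `four_week_momentum` is hypothetical, and the input list is the last eight weekly counts from the hiring-velocity data.

```python
def four_week_momentum(weekly_counts, window=4):
    """Compare the latest `window` weeks of new roles with the preceding `window` weeks.

    Assumption: momentum = (recent sum - prior sum), also expressed as a
    percentage of the prior sum. The page itself does not document the formula.
    """
    if len(weekly_counts) < 2 * window:
        raise ValueError("need at least 2 * window weekly counts")
    recent = sum(weekly_counts[-window:])
    prior = sum(weekly_counts[-2 * window:-window])
    delta = recent - prior
    # Percentage change is undefined when the prior window had no openings.
    pct = round(100 * delta / prior) if prior else None
    return recent, prior, delta, pct


# Last eight weekly counts, oldest first (weeks of Mar 16 through May 4).
last_eight_weeks = [12, 10, 15, 16, 10, 19, 14, 24]
recent, prior, delta, pct = four_week_momentum(last_eight_weeks)
print(f"{recent} opens last 4w, {prior} prior 4w, {delta:+d} ({pct:+d}%)")
```

Rounding to a whole percent is also an assumption, chosen to match how the figure is displayed.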

Jobs (21)

21 AI · 259 total active
Title · Stage · Function · Location · First seen · AI score
VP of Product, Research and Training Infrastructure
VP of Product for Research and Training Infrastructure at an AI cloud provider. This role owns the product strategy and engineering execution for services powering AI research labs, focusing on specialized orchestration, evaluation, and iteration tools for massive-scale pre-training and post-training. Key responsibilities include evolving orchestration tools (SUNK), developing automated training-based evaluation frameworks, and building infrastructure for RL/RLHF pipelines. Requires deep knowledge of HPC, distributed training, and supporting frontier model research.
Stage: Pretrain, Post-train · Function: Product · Location: Bellevue, WA +4 · First seen: 7w ago · AI score: 9
Staff AI Security Engineer
Staff AI Security Engineer to define and operationalize security across CoreWeave's AI ecosystem, focusing on secure-by-default foundations for AI development, agentic workflows, and enterprise AI adoption. The role involves building secure infrastructure, developing AI security policies, implementing guardrails for agentic systems, leading secure adoption of AI tools, and conducting adversarial testing.
Stage: Agent, Serve · Function: Engineering · Location: Bellevue, WA +4 · First seen: 1w ago · AI score: 8
AI Solutions Engineer, Pre-Sales - W&B
AI Solutions Engineer focused on helping customers design, deploy, and scale ML and GenAI systems using Weights & Biases and CoreWeave's AI cloud. This role involves technical depth, customer engagement, architecting solutions for distributed training, RAG, agents, fine-tuning, and inference.
Stage: Agent, Serve · Function: Engineering · Location: Bellevue, WA +5 · First seen: Jan 26 · AI score: 8
Principal Engineer - Perf and Benchmarking
Principal Engineer role focused on leading the Benchmarking & Performance team at CoreWeave, a cloud provider for AI. The role involves defining strategy, leading end-to-end MLPerf submissions (Training & Inference), designing and implementing a Kubernetes-native benchmarking service for latency and throughput, and building CI/CD pipelines for scale. It requires deep expertise in distributed systems, GPU performance, model-serving stacks, and Kubernetes, with a focus on achieving industry-leading performance data and publications.
Stage: Serve, Eval Gate · Function: Engineering · Location: Bellevue, WA +1 · First seen: Dec '25 · AI score: 8
Staff Software Engineer, Inference
Staff Software Engineer on the Inference Platform Team at CoreWeave, focusing on building and operating a Kubernetes-native inference platform for AI workloads. The role involves technical leadership in architecture, performance optimization (latency, throughput, GPU utilization), and system reliability for low-latency, high-throughput systems at massive scale, with deep work in distributed systems and Kubernetes infrastructure.
Stage: Serve · Function: Engineering · Location: Bellevue, WA +1 · First seen: 3d ago · AI score: 7
Staff Technical Program Manager - Cluster Orchestration & Applied Training
Staff Technical Program Manager to lead cross-functional programs for AI/ML Platform Services, focusing on Cluster Orchestration (scheduling, launching, managing AI workloads) and Applied Training (enabling researchers to use infrastructure for pre-training, fine-tuning, RL, evaluations). The role involves partnering with engineering, product, and research teams to improve workload execution and user interaction with training platforms, driving delivery across various AI training workflows and ensuring successful launches and operational ownership.
Stage: Serve, Post-train · Function: Engineering · Location: Bellevue, WA · First seen: 4d ago · AI score: 7
Senior Software Engineer, Applied AI
Senior Software Engineer to design and build production-grade, full-stack AI-native analytics platforms and first-party applications that embed governed data directly into operational workflows. The role involves developing AI-enabled user experiences, scalable backend services, and intuitive interfaces, integrating AI/LLM capabilities into real-world applications.
Stage: Ship · Function: Engineering · Location: Bellevue, WA +1 · First seen: 5w ago · AI score: 7
Principal Engineer, Cluster Orchestration
CoreWeave is seeking a Principal Engineer to lead the design and evolution of their AI infrastructure's cluster orchestration systems, including Slurm, Kubernetes, and SUNK. This role involves defining long-term architecture, solving scaling problems, and ensuring the reliability and efficiency of GPU resource utilization for AI training and inference workloads.
Stage: Serve · Function: Engineering · Location: Bellevue, WA +1 · First seen: Feb 27 · AI score: 7
Staff Product Manager, Insights
Staff Product Manager for CoreWeave's Insights team, focusing on developing AI-powered observability experiences for AI workloads. The role involves defining strategy, roadmaps, and metrics for dashboards, alerts, and AI-driven insights to help customers understand performance, reliability, and cost in their cloud environments. Key responsibilities include translating telemetry into actionable insights and driving proactive surfacing of information, particularly for cost optimization and workload efficiency.
Stage: Agent, Eval Gate · Function: Product · Location: Bellevue, WA +5 · Remote · First seen: Feb 18 · AI score: 7
Senior Security Engineer II, Vulnerability Management
Senior Security Engineer to build and scale AI-powered vulnerability management programs for AI infrastructure. This role involves architecting automation systems, driving risk-based prioritization, and influencing automation priorities. It requires strong development skills in Python/Go and experience with modern security tooling, with a focus on applying AI/ML to security workflows.
Stage: Data, Agent · Function: Engineering · Location: Bellevue, WA +4 · First seen: Feb 6 · AI score: 7
Senior Software Engineer, Observability Insights
Senior Software Engineer to lead development of agentic interfaces and product experiences for AI system observability, focusing on multi-tenant APIs, Grafana, and tool servers. Requires experience in backend systems, distributed APIs, reliability engineering, and agentic applications/LLM features.
Stage: Agent, Serve · Function: Engineering · Location: New York, NY +1 · First seen: Feb 2 · AI score: 7
Solutions Architect - HPC/AI/ML
Solutions Architect role focused on supporting customers running AI/ML workloads on CoreWeave's HPC cloud infrastructure, with an emphasis on AI/ML inference. Responsibilities include technical customer contact, solution design, proof of concept development, and workload optimization. Requires expertise in cloud computing, distributed systems, AI/ML inference, NVIDIA GPUs, and Kubernetes.
Stage: Serve · Function: Engineering · Location: Singapore · First seen: Jan 29 · AI score: 7
Senior Software Engineer II, Applied Training
Senior Software Engineer II, Applied Training at CoreWeave, focusing on building and scaling Kubernetes-native research cluster platforms and sandbox client infrastructure for agentic training and evaluation. The role aims to provide AI labs with advanced research infrastructure, enabling them to focus on model training rather than operations. Responsibilities include contributing to the roadmap, designing cluster experiences, owning SDKs for agent rollouts and benchmarks, writing documentation, and working closely with large AI labs.
Stage: Serve, Agent · Function: Engineering · Location: Bellevue, WA +2 · First seen: Jan 23 · AI score: 7
Staff Software Engineer, Applied Training
CoreWeave is seeking a Staff Software Engineer to join their Applied Training team. This role will focus on building and improving their Kubernetes-native research cluster platform and sandbox client for agentic training and evaluation. The goal is to provide AI researchers with the infrastructure needed to train models efficiently, abstracting away operational complexities. Responsibilities include contributing to the roadmap, designing and building cluster experiences, owning the Python SDK for agentic workflows, and documenting training frameworks. The ideal candidate has extensive experience in distributed systems, ML infrastructure, or developer platforms, with strong Kubernetes expertise and familiarity with AI training and agentic workflows.
Stage: Serve, Agent · Function: Engineering · Location: Bellevue, WA +2 · First seen: Jan 23 · AI score: 7
Senior Software Engineer I, Inference
CoreWeave is seeking a Senior Software Engineer to own and improve their Kubernetes-native inference platform, focusing on latency, throughput, and reliability. The role involves leading design, implementing optimizations, strengthening incident posture, and mentoring junior engineers. Requires experience with distributed systems, Kubernetes, and inference internals.
Stage: Serve · Function: Engineering · Location: Bellevue, WA +1 · First seen: Jan 23 · AI score: 7
Sr. Software Engineer - Perf and Benchmarking
Senior Software Engineer focused on performance and benchmarking of AI infrastructure, including Kubernetes-native services, MLPerf runs, and model-serving stacks. The role involves building and improving services to measure latency, throughput, and cost, and ensuring reproducible benchmarking processes.
Stage: Serve, Eval Gate · Function: Engineering · Location: Bellevue, WA +1 · First seen: Dec '25 · AI score: 7
Senior Software Engineer (Full-Stack + Agentic AI)
Senior Software Engineer role focused on developing AI agents and full-stack applications for internal enterprise systems. The role involves using frameworks like LangChain and LangGraph, building backend services, and integrating with various enterprise systems to automate tasks in Finance, Billing, and Supply Chain.
Stage: Agent · Function: Engineering · Location: Bellevue, WA +1 · First seen: Dec '25 · AI score: 7
Software Engineer, Inference AI/ML
Software Engineer focused on improving the latency, reliability, and cost of model serving on a GPU platform, working with services like Triton, vLLM, and TensorRT-LLM.
Stage: Serve · Function: Engineering · Location: Bellevue, WA +1 · First seen: Oct '25 · AI score: 7
Senior Software Engineer II, Inference
Senior Software Engineer II focused on owning and optimizing CoreWeave's Kubernetes-native inference platform to meet strict P99 SLAs at scale. Responsibilities include leading design reviews, implementing advanced optimizations for latency and throughput, strengthening incident posture, and mentoring junior engineers. Requires strong experience in distributed systems, Python/Go, networked systems performance, Kubernetes, and ML inference internals.
Stage: Serve · Function: Engineering · Location: Bellevue, WA +1 · First seen: Sep '25 · AI score: 7
Solutions Architect - HPC/AI/ML
Solutions Architect role focused on AI/ML inference workloads on high-performance compute (HPC) infrastructure, primarily using Kubernetes and NVIDIA GPUs. The role involves customer technical contact, solution design, proof of concept, workload optimization, and providing feedback to product teams.
Stage: Serve · Function: Engineering · Location: Bellevue, WA +4 · First seen: Oct '24 · AI score: 7
Senior Systems Engineer, OS Automation
Senior Systems Engineer focused on automating and scaling Linux OS and Kernel build pipelines, with a strong emphasis on integrating AI/ML technologies like LLMs, RAG, and predictive modeling to create AI-native infrastructure, smart CI/CD, auto-remediation, and predictive regression detection.
Stage: Serve, Agent · Function: Engineering · Location: Bellevue, WA +3 · First seen: Aug '24 · AI score: 7