Currently tracking 83 active AI roles, with 34 new openings in the last 4 weeks. Primary focus: Agent · Engineering. Salary range $139k–$393k (avg $255k).
Scale AI currently has 104 active AI-related job listings. The majority of these roles are focused on agents, representing 34% of the total openings. Engineering is the top function, followed by Research. The company is actively hiring for positions related to model serving, agent orchestration, and evals. Over the last 30 days, Scale AI has added 20 new AI roles, a significant increase of 186% compared to the preceding 30-day period.
Scale AI currently has 102 active AI-related roles in our index. The most common open titles are: Product Manager of AI Applications, Global Public Sector (2); Software Engineer, Robotics (2); Solutions Engineer, Enterprise (2); Machine Learning Research Engineer, Agent Data Foundation - Enterprise GenAI; Senior Software Engineer, Full-Stack – Scale GP. Most positions are in Engineering and Research.
Scale AI's active AI hiring is concentrated in: agents (34%), application (22%), evaluation (15%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
Scale AI is hiring AI talent in: United States (70 roles), United Kingdom (13 roles), Qatar (2 roles), Mexico (2 roles).
Job postings at Scale AI most frequently reference: agent orchestration, model serving, evals, LLM observability, fine-tuning.
In the past 30 days, Scale AI has posted 15 new AI-related roles.
| Title | Description | Stage | AI score |
|---|---|---|---|
| Research Scientist, Frontier Risk Evaluations | Research Scientist role focused on designing and building evaluation measures, harnesses, and datasets for frontier AI systems, with a focus on identifying and mitigating risks. The role involves collaboration with external agencies and publishing findings, bridging AI research and policy. | Eval Gate, Agent | 9 |
| Research Scientist, AI Controls and Monitoring | Research Scientist role focused on designing methods, systems, and experiments for AI controls and monitoring, ensuring advanced AI models and agents remain aligned with intended goals, even in high-stakes or adversarial environments. This includes developing monitoring techniques, researching layered control mechanisms, designing red-team simulations, and collaborating with policymakers. | | 9 |
| Strategic Projects Lead, Red Team | Scale AI is seeking a Strategic Projects Lead for their Red Team and Safety function. This role focuses on managing partnerships with frontier AI model developers, stress-testing AI models, and shaping their deployment. The lead will act as a subject-matter expert, coordinate delivery with research and operations, and contribute to public benchmark launches. The role requires technical curiosity, operational rigor, and strong communication skills to bridge technical and commercial audiences. | Eval Gate | 8 |
| Head of Policy & Security Research Lab | Lead a team of research scientists, policy experts, and engineers focused on foundational AI safety and security work, including developing frameworks and benchmarks for frontier AI models. The role requires a strong technical and policy background with extensive knowledge of frontier risk evaluations, AI control, and preparedness research. | Eval Gate | 8 |
| Senior Machine Learning Engineer - Model Evaluations, Public Sector | This role focuses on building and scaling automated evaluation pipelines for AI systems, including LLMs and agentic models, to ensure their reliability, safety, and effectiveness in mission-critical government environments. It involves designing test datasets, benchmarks, and frameworks for various metrics, including LLM-judge evaluations, agent testing, and stress tests. | Eval Gate, Agent | 8 |
| Product Manager, Public Sector GenAI Test & Evaluation (T&E) | Product Manager for GenAI Test & Evaluation (T&E) in the Public Sector team at Scale AI. This role focuses on defining the vision and roadmap for evaluation capabilities, owning the T&E tech stack to measure and improve agentic applications. Requires strong engineering depth, experience with evaluation systems, problem distillation, ambiguity management, cross-functional leadership, and operational execution. Experience with GenAI implementation, public sector work, and security clearance are preferred. | Eval Gate | 7 |