Scale AI currently has 104 active AI-related job listings. Agents are the largest focus area, accounting for 34% of total openings. Engineering is the top function, followed by Research. The company is actively hiring for positions related to model serving, agent orchestration, and evals. Over the last 30 days, Scale AI has added 20 new AI roles, a 186% increase over the preceding 30-day period.
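As a rough check on the growth figure above, the percent change can be recomputed from the two period counts. The sketch below is illustrative only: the prior-period count of 7 is not stated on this page and is inferred from the reported 20 new roles and 186% increase, and the function names are hypothetical.

```python
def percent_change(current: int, previous: int) -> float:
    """Percentage change from the previous period to the current one."""
    if previous == 0:
        raise ValueError("previous period has no roles; percent change is undefined")
    return (current - previous) / previous * 100


def implied_previous(current: int, pct_increase: float) -> float:
    """Back out the previous-period count from a reported percentage increase."""
    return current / (1 + pct_increase / 100)


new_roles = 20             # from the summary above
reported_increase = 186.0  # from the summary above

# Assumption: the prior count is whatever makes the two reported figures agree.
prior_estimate = implied_previous(new_roles, reported_increase)
print(f"implied prior-period count: {prior_estimate:.2f}")

# With a prior count of 7 (an inferred value, not a reported one):
print(f"percent change from 7 to 20: {percent_change(new_roles, 7):.1f}%")
```

Under that assumption the reported numbers are internally consistent: going from about 7 new roles to 20 is an increase of roughly 186%.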
Currently tracking 83 active AI roles, with 34 new openings in the last 4 weeks. Primary focus: Agent · Engineering. Salary range $139k–$393k (avg $255k).
Scale AI currently has 102 active AI-related roles in our index. The most common open titles are: Product Manager of AI Applications, Global Public Sector (2); Software Engineer, Robotics (2); Solutions Engineer, Enterprise (2); Machine Learning Research Engineer, Agent Data Foundation - Enterprise GenAI; Senior Software Engineer, Full-Stack – Scale GP. Most positions are in Engineering and Research.
Scale AI's active AI hiring is concentrated in: agents (34%), application (22%), evaluation (15%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
Scale AI is hiring AI talent in: United States (70 roles), United Kingdom (13 roles), Qatar (2 roles), Mexico (2 roles).
Job postings at Scale AI most frequently reference: agent orchestration, model serving, evals, LLM observability, fine-tuning.
In the past 30 days, Scale AI has posted 15 new AI-related roles.
| Title | Description | Stage | AI score |
|---|---|---|---|
| Research Scientist, Frontier Risk Evaluations | Research Scientist role focused on designing and building evaluation measures, harnesses, and datasets for frontier AI systems, with a focus on identifying and mitigating risks. The role involves collaboration with external agencies and publishing findings, bridging AI research and policy. | Eval Gate, Agent | 9 |
| Research Scientist, AI Controls and Monitoring | Research Scientist role focused on designing methods, systems, and experiments for AI controls and monitoring, ensuring advanced AI models and agents remain aligned with intended goals, even in high-stakes or adversarial environments. This includes developing monitoring techniques, researching layered control mechanisms, designing red-team simulations, and collaborating with policymakers. | | 9 |
| Staff Machine Learning Research Scientist, LLM Evals | Scale AI is seeking a Staff Machine Learning Research Scientist to lead the development of novel evaluation methodologies, metrics, and benchmarks for large language models (LLMs). This role focuses on defining and measuring the capabilities and limitations of frontier LLMs, driving research that informs internal roadmaps and the broader community. Responsibilities include researching existing evaluation techniques, designing new benchmarks, implementing scalable evaluation pipelines, publishing findings, and mentoring junior researchers. The ideal candidate has 5+ years of experience in LLMs/NLP, a strong publication record, and experience leading research teams. | Eval Gate, Post-train | 9 |
| Tech Lead/Manager, Machine Learning Research Scientist- LLM Evals | Scale AI is seeking a Tech Lead/Manager for their LLM Evals Research team. This role involves leading a team to develop and implement novel evaluation methodologies, metrics, and benchmarks for large language models, focusing on areas like instruction following, factuality, robustness, and fairness. The position requires research into LLM evaluation techniques, communication with clients and internal teams, implementation of scalable evaluation pipelines, and publishing research findings. The ideal candidate has extensive experience in LLMs, NLP, and Transformer modeling, with a proven track record of research impact and team leadership. | Eval Gate, Post-train | 9 |
| Head of Policy & Security Research Lab | Lead a team of research scientists, policy experts, and engineers focused on foundational AI safety and security work, including developing frameworks and benchmarks for frontier AI models. The role requires a strong technical and policy background with extensive knowledge of frontier risk evaluations, AI control, and preparedness research. | Eval Gate | 8 |
| Research Advisor - Human Frontier Collective (UK) | Independent contractor opportunity for a Research Advisor to join the Human Frontier Collective (HFC) at Scale AI. The role involves providing consultancy on model behavior and domain-specific logic, collaborating on research to design evaluation frameworks for frontier models, engaging with clients as a Subject Matter Expert, creating technical content, and contributing to research publications. The role requires 5+ years of relevant industry experience with strong domain knowledge in fields like finance, legal, or medical, and advanced degrees. The compensation is $300/hr USD. | Eval Gate | 8 |
| Research Advisor - Human Frontier Collective (US) | This role involves providing expert consultancy on AI model behavior and governance, collaborating on research to design evaluation frameworks for frontier models, engaging with clients as a Subject Matter Expert, creating technical content, and co-authoring research publications. The focus is on evaluating and interpreting advanced generative AI systems, particularly in specialized domains like finance, legal, or medical. | Eval Gate | 8 |
| SWE Fellow - Human Frontier Collective (Canada) | This role focuses on evaluating and interpreting advanced generative AI systems, designing datasets for rigorous evaluation, and contributing to research publications. It involves collaborating with AI labs and platforms to enhance model accuracy and reasoning, positioning it as a key player in the AI evaluation gate. | Eval Gate | 8 |
| SWE Fellow - Human Frontier Collective (UK) | This role is for a Software Engineer Fellow focused on evaluating and interpreting advanced generative AI systems within the Human Frontier Collective program. The fellow will design datasets, evaluate AI models, and contribute to research publications, aiming to enhance AI accuracy and reasoning. | Eval Gate | 8 |
| SWE Fellow - Human Frontier Collective (US) | This role focuses on evaluating advanced generative AI systems by designing domain-specific problems and datasets, providing expert insights to enhance model performance, and contributing to research publications. It's a research-oriented role within a fellowship program aimed at shaping the future of AI. | Eval Gate | 8 |
| STEM Fellow - Human Frontier Collective (UK) | This role focuses on evaluating and interpreting advanced generative AI systems by designing domain-specific problems and datasets, providing expert insights to enhance model performance, and contributing to research publications. It's a research-oriented fellowship with a focus on AI evaluation. | Eval Gate | 8 |
| STEM Fellow - Human Frontier Collective (US) | This role focuses on evaluating and interpreting advanced generative AI systems by designing domain-specific problems and datasets, providing expert insights to enhance model performance, and contributing to research publications. It involves collaboration with AI labs and Scale's research team. | Eval Gate, Post-train | 8 |
| Medical Fellow - Human Frontier Collective (UK) | This role involves a medical professional (MD, DO) collaborating with AI labs to evaluate generative AI systems, focusing on safety, accuracy, and reasoning frameworks within healthcare. The goal is to apply clinical expertise to shape AI decision-making and contribute to research publications. The role is a 6-month independent contractor position with potential for extension. | Eval Gate, Post-train | 7 |
| Medical Fellow - Human Frontier Collective (US) | This role is for a Medical Fellow to collaborate on high-impact projects evaluating and interpreting advanced generative AI systems in healthcare. The fellow will design clinical scenarios, evaluate model safety and accuracy, shape reasoning frameworks, and contribute to research publications. Prior AI experience is not required but is a plus. | Eval Gate | 7 |