What you'd actually do

Lead the design and development of Cresta’s next-generation AI Agents and Agentic Assist systems, defining system architecture and core modeling approaches.

Architect intelligent, multi-step agent workflows that combine real-time guidance, knowledge retrieval, reasoning, summarization, and automated actions into cohesive production systems.

Design, deploy, and optimize LLM-powered systems, including Retrieval-Augmented Generation (RAG) pipelines, multi-agent orchestration, and domain-adapted models.

Develop evaluation strategies for complex, non-deterministic systems, including offline benchmarking, online experimentation, and LLM-as-a-judge methodologies.

Diagnose and mitigate real-world failure modes such as hallucinations, retrieval errors, tool misuse, prompt brittleness, and multi-step reasoning breakdowns.

What the JD emphasized

strong pre-LLM ML foundations

deep expertise in LLMs

proven ability to translate cutting-edge research into scalable, production-grade systems

design evaluation frameworks

diagnosing and mitigating failure modes

defining measurable quality metrics

architect and scale LLM and retrieval-augmented generation pipelines

ground models in enterprise data

building high-performance ML systems

extract structured insights

deliver real-time, actionable intelligence at scale

multi-step agent workflows

knowledge retrieval

reasoning

summarization

automated actions

cohesive production systems

Retrieval-Augmented Generation (RAG) pipelines

multi-agent orchestration

domain-adapted models

reasoning, planning, and tool-use capabilities

real-world AI applications

evaluation strategies for complex, non-deterministic systems

offline benchmarking

online experimentation

LLM-as-a-judge methodologies

hallucinations

retrieval errors

tool misuse

prompt brittleness

multi-step reasoning breakdowns

accuracy

faithfulness

task completion

latency

cost

robustness

scalability

latency

security

cost efficiency

production environments

Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Our platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices, automate conversations and inefficient processes, and empower every team member to work smarter and faster. Cresta was born from the prestigious Stanford AI Lab, and its co-founder and chairman is Sebastian Thrun, the genius behind Google X, Waymo, Udacity, and more. Our leadership also includes CEO Ping Wu, the co-founder of Google Contact Center AI and the Vertex AI platform, and co-founder Tim Shi, an early member of OpenAI.

Join us on this thrilling journey to revolutionize the workforce with AI. The future of work is here, and it's at Cresta.

About the role:

Machine Learning Engineers at Cresta work across several high-impact AI initiatives. Final team placement is determined based on experience, strengths, and business needs.

Current focus areas include:

Agentic Assist: Lead and build next-generation agentic AI systems that augment contact center agents in real time. This track requires strong pre-LLM ML foundations, deep expertise in LLMs and modern prompting techniques, a rapid prototyping mindset, and a proven ability to translate cutting-edge research into scalable, production-grade systems.
Agent & System Quality: Design evaluation frameworks and improve the reliability, robustness, and performance of LLM-powered agents. This includes diagnosing and mitigating failure modes such as hallucinations, retrieval errors, tool misuse, context drift, prompt brittleness, and multi-step reasoning breakdowns, while defining measurable quality metrics (e.g., accuracy, faithfulness, task completion, latency, and cost) for complex, non-deterministic systems.
Insights: Architect and scale LLM and retrieval-augmented generation pipelines that ground models in enterprise data. This track focuses on building high-performance ML systems that process complex data, extract structured insights, and deliver real-time, actionable intelligence at scale.

Responsibilities:

Lead the design and development of Cresta’s next-generation AI Agents and Agentic Assist systems, defining system architecture and core modeling approaches.
Architect intelligent, multi-step agent workflows that combine real-time guidance, knowledge retrieval, reasoning, summarization, and automated actions into cohesive production systems.
Design, deploy, and optimize LLM-powered systems, including Retrieval-Augmented Generation (RAG) pipelines, multi-agent orchestration, and domain-adapted models.
Improve reasoning, planning, and tool-use capabilities in real-world AI applications.
Develop evaluation strategies for complex, non-deterministic systems, including offline benchmarking, online experimentation, and LLM-as-a-judge methodologies.
Diagnose and mitigate real-world failure modes such as hallucinations, retrieval errors, tool misuse, prompt brittleness, and multi-step reasoning breakdowns.
Define and measure quality metrics (e.g., accuracy, faithfulness, task completion, latency, cost, robustness) to improve system reliability and performance.
Optimize AI systems for scalability, latency, security, and cost efficiency in production environments.
Collaborate cross-functionally with product, frontend, and backend teams to integrate AI capabilities seamlessly into Cresta’s platform.
Mentor engineers, contribute to technical strategy, and help shape the roadmap for Cresta’s AI systems.

Qualifications We Value:

Bachelor’s degree in Computer Science, Mathematics, or a related field; Master’s or Ph.D. preferred.
5–8+ years of industry experience building and deploying machine learning systems in production, including significant experience working with LLMs.
Strong expertise in NLP, Generative AI, transformer architectures, embeddings, and retrieval systems.
Proven experience designing and deploying Retrieval-Augmented Generation (RAG) systems in enterprise environments.
Experience building and evaluating complex agentic or multi-step LLM workflows.
Strong knowledge of modern ML frameworks and tools (e.g., PyTorch, TensorFlow, Hugging Face) and distributed/cloud-based infrastructure.
Demonstrated ability to optimize real-time ML systems for performance, scalability, and reliability.
Strong technical leadership skills, with the ability to influence cross-functional decisions and raise the engineering bar.

Perks & Benefits:

We offer Cresta employees a variety of medical, dental, and vision plans, designed to fit the needs of you and your family
Paid parental leave to support you and your family
Monthly Health & Wellness allowance
Work-from-home office stipend to help you succeed in a remote environment
Lunch reimbursement for in-office employees
PTO: 3 weeks in Canada

Compensation for this position includes a base salary, equity, and a variety of benefits. Actual base salaries will be based on candidate-specific factors, including experience, skillset, and location, and local minimum pay requirements as applicable. We are actively hiring for this role in the US and Canada. Your recruiter can provide further details.

This posting will be used to fill a newly created role.

We have noticed a rise in recruiting impersonations across the industry, where scammers attempt to access candidates' personal and financial information through fake interviews and offers. All Cresta recruiting emails come from the @cresta.ai domain. Any outreach claiming to be from Cresta via other sources should be ignored. If you are uncertain whether you have been contacted by an official Cresta employee, reach out to recruiting@cresta.ai.

Senior Machine Learning Engineer
