Senior Machine Learning Engineer

Cresta · Vertical AI · United States · Remote · Engineering

Senior Machine Learning Engineer role focused on building and scaling next-generation agentic AI systems and LLM-powered applications for the contact center. Responsibilities include designing agent workflows, RAG pipelines, and multi-agent orchestration, and developing evaluation strategies for complex, non-deterministic systems. The role requires strong expertise in LLMs, modern prompting techniques, and production deployment of ML systems.

What you'd actually do

  1. Lead the design and development of Cresta’s next-generation AI Agents and Agentic Assist systems, defining system architecture and core modeling approaches.
  2. Architect intelligent, multi-step agent workflows that combine real-time guidance, knowledge retrieval, reasoning, summarization, and automated actions into cohesive production systems.
  3. Design, deploy, and optimize LLM-powered systems, including Retrieval-Augmented Generation (RAG) pipelines, multi-agent orchestration, and domain-adapted models.
  4. Develop evaluation strategies for complex, non-deterministic systems, including offline benchmarking, online experimentation, and LLM-as-a-judge methodologies.
  5. Diagnose and mitigate real-world failure modes such as hallucinations, retrieval errors, tool misuse, prompt brittleness, and multi-step reasoning breakdowns.

Skills

Required

  • LLMs
  • modern prompting techniques
  • NLP
  • Generative AI
  • transformer architectures
  • embeddings
  • retrieval systems
  • RAG systems
  • agentic workflows
  • multi-step LLM workflows
  • ML frameworks (PyTorch, TensorFlow, Hugging Face)
  • distributed/cloud-based infrastructure
  • real-time ML systems optimization
  • technical leadership

Nice to have

  • Master’s or Ph.D.
  • pre-LLM ML foundations

What the JD emphasized

  • Lead and build next-generation agentic AI systems
  • deep expertise in LLMs
  • scalable, production-grade systems
  • Design evaluation frameworks
  • diagnosing and mitigating failure modes
  • defining measurable quality metrics
  • Architect and scale LLM and retrieval-augmented generation pipelines
  • building high-performance ML systems
  • extract structured insights
  • deliver real-time, actionable intelligence at scale
  • Lead the design and development
  • defining system architecture
  • core modeling approaches
  • Architect intelligent, multi-step agent workflows
  • cohesive production systems
  • Design, deploy, and optimize LLM-powered systems
  • Retrieval-Augmented Generation (RAG) pipelines
  • multi-agent orchestration
  • domain-adapted models
  • Improve reasoning, planning, and tool-use capabilities
  • real-world AI applications
  • Develop evaluation strategies
  • complex, non-deterministic systems
  • offline benchmarking
  • online experimentation
  • LLM-as-a-judge methodologies
  • Diagnose and mitigate real-world failure modes
  • hallucinations
  • retrieval errors
  • tool misuse
  • prompt brittleness
  • multi-step reasoning breakdowns
  • Define and measure quality metrics
  • accuracy
  • faithfulness
  • task completion
  • latency
  • cost
  • robustness
  • improve system reliability and performance
  • Optimize AI systems for scalability, latency, security, and cost efficiency
  • production environments
  • Bachelor’s degree in Computer Science, Mathematics, or a related field; Master’s or Ph.D. preferred.
  • 5–8+ years of industry experience building and deploying machine learning systems in production
  • significant experience working with LLMs
  • Strong expertise in NLP, Generative AI, transformer architectures, embeddings, and retrieval systems.
  • Proven experience designing and deploying Retrieval-Augmented Generation (RAG) systems in enterprise environments.
  • Experience building and evaluating complex agentic or multi-step LLM workflows.
  • Strong knowledge of modern ML frameworks and tools (e.g., PyTorch, TensorFlow, Hugging Face) and distributed/cloud-based infrastructure.
  • Demonstrated ability to optimize real-time ML systems for performance, scalability, and reliability.
  • Strong technical leadership skills
  • influence cross-functional decisions
  • raise the engineering bar

Other signals

  • LLM-powered agents
  • Agentic AI systems
  • Retrieval-Augmented Generation (RAG)
  • Multi-agent orchestration
  • Evaluation frameworks for LLMs