Sr Machine Learning Engineer 5 -- AEP, Agentic System

Adobe · Enterprise · San Jose, CA

This Senior Machine Learning Engineer role builds and operates scalable, intelligent AI systems for end-user AI products. The focus is on agentic systems: orchestration, tool integration, retrieval, memory, evaluation, safety/guardrails, and high-performance backend systems. The role involves owning the architecture end to end, building AI-powered product capabilities, designing agent orchestration, developing core platform components (RAG, memory), establishing safety, governance, and MLOps practices, and providing technical leadership.

What you'd actually do

  1. Own end-to-end architecture and delivery of production-grade agentic AI systems—from orchestration and tool execution to retrieval and response generation—built for reliability, scale, and maintainability.
  2. Build AI-powered product capabilities using predictive and generative approaches, enabling autonomous agents that improve customer experience workflows.
  3. Design robust agent orchestration (single- and multi-agent), including planning, delegation, and structured tool use, with strong control flow and failure handling.
  4. Develop core platform components such as retrieval (RAG), memory/state, and LLM/provider abstraction, and drive ongoing improvements through evaluation, monitoring, and experimentation.
  5. Establish safety, governance, and MLOps guidelines (guardrails, observability, CI/CD, operational readiness) to ensure trustworthy, production-quality outcomes.

Skills

Required

  • Graduate degree with 10+ years, or PhD with 8+ years, of experience building and deploying ML systems at scale.
  • Deep expertise in machine learning, end-to-end modeling life cycle, and real-time decisioning architectures.
  • Proven success in building and shipping end-to-end ML systems, from research to deployment and ongoing optimization.
  • Hands-on experience with MLOps, including model lifecycle management, monitoring, automated retraining, CI/CD for ML, and large-scale inference systems.
  • Proficiency in Python and ML frameworks such as PyTorch, TensorFlow, HuggingFace, LangChain, or equivalent.
  • Excellent cross-functional collaboration skills and demonstrated technical leadership.
  • Experience bridging research and production in enterprise-scale AI applications.

Nice to have

  • Some experience with LLMs, agentic systems, prompt engineering, RAG, or context engineering, especially in production environments.

What the JD emphasized

  • end-to-end architecture and delivery
  • production-grade agentic AI systems
  • orchestration
  • tool integration
  • retrieval and memory services
  • evaluation
  • safety/guardrails
  • high-performance backend systems
  • building and shipping end-to-end ML systems
  • MLOps
  • large-scale inference systems
  • enterprise-scale AI applications

Other signals

  • building and operating scalable intelligent AI systems
  • architectural ownership across the agent stack
  • bringing new capabilities from concept to production