AI Tech Lead - Staff Machine Learning Engineer

Sumo Logic · Enterprise · United States · Software Engineering

Lead the design and delivery of next-generation agentic AI systems for the Security Operations Center (Agentic SOC): evaluate, prototype, and productionize state-of-the-art agentic AI technologies, and build scalable multi-agent architectures.

What you'd actually do

  1. Partner with fellow leaders and their teams to lead the technical evaluation and adoption of cutting-edge agentic AI platforms, including Anthropic (Claude), LangChain/LangGraph, AWS Bedrock, and other emerging agent frameworks.
  2. Architect, prototype, and productionize multi-agent AI systems for Agentic SOC use cases, including detection, triage, investigation, and response workflows.
  3. Own the design of core agent architecture components, including planning, execution, tool orchestration, memory, context engineering, and long-running agent workflows.
  4. Lead AI agent evaluation systems, including offline and online evaluation pipelines, golden datasets, synthetic data generation, human- and LLM-based judging, and continuous quality monitoring.
  5. Drive LLM fine-tuning and alignment efforts to improve domain-specific reasoning, accuracy, and reliability for security and observability use cases.

Skills

Required

  • Python
  • LLMs
  • prompt engineering
  • context engineering
  • agentic AI design patterns
  • reasoning workflows
  • machine learning
  • distributed systems
  • data pipelines
  • large-scale system design
  • evaluation frameworks for ML/LLM systems

Nice to have

  • LangGraph
  • LangChain
  • CrewAI
  • Anthropic
  • OpenAI
  • AWS Bedrock
  • Vertex AI
  • SFT
  • RLHF
  • RLAIF
  • preference learning
  • domain adaptation
  • LLMOps
  • inference optimization
  • latency/cost management
  • observability
  • production monitoring
  • PyTorch
  • MLflow
  • Airflow
  • Docker
  • Kubernetes
  • AWS
  • GCP
  • Azure
  • applying AI/ML to security, observability, and large-scale log/telemetry data

What the JD emphasized

  • production ML/AI systems
  • agentic AI design patterns
  • evaluation frameworks for ML/LLM systems
  • agentic AI systems
  • multi-agent architectures
  • LLM fine-tuning pipelines

Other signals

  • Agentic AI systems
  • Multi-agent architectures
  • LLM fine-tuning
  • Production AI infrastructure