Senior Software Engineer, Agent Orchestration

Decagon · Vertical AI · New York, NY · Engineering

Decagon is hiring a Senior Software Engineer to design and build the runtime and model orchestration layer for conversational AI agents. The role centers on the agent harness, which handles routing, execution logic, tool orchestration, and control-plane systems for live conversations. The work involves optimizing for latency and reliability, analyzing failures, and iterating quickly to improve agent performance in production.

What you'd actually do

  1. Design and evolve agent harnesses that power different product experiences
  2. Build core runtime systems, including AOP execution and multi-model orchestration
  3. Develop control-plane logic for routing, planning, and tool invocation with strong safety guarantees
  4. Optimize agent systems for latency, reliability, and production correctness
  5. Analyze real-world failures and use data to drive iterative improvements

Skills

Required

  • Strong experience building distributed systems or backend platforms in production environments
  • Comfort working in ambiguous, fast-moving environments with rapid iteration cycles
  • Experience owning systems end-to-end, from design through production and iteration
  • Familiarity with experimentation, evaluation, or data-driven product improvement loops
  • A track record of improving system reliability, performance, and observability
  • Ability to debug complex systems and identify root causes of failures

Nice to have

  • You’ve built or worked on agent harnesses, orchestration layers, or execution frameworks
  • You think in terms of control planes, feedback loops, and system-level optimization, not just features
  • You’re excited about diagnosing failure modes and iterating toward measurable improvements
  • You care deeply about production quality—not just making systems work, but making them reliable, safe, and scalable
  • You’re motivated by pushing the frontier of how intelligent systems behave in the real world

What the JD emphasized

  • strong safety guarantees
  • real-world failures
  • offline evaluation
  • online experimentation
  • production correctness
  • reliability
  • latency
  • observability
  • testing
  • simulation systems
  • voice and real-time systems
  • turn-taking
  • latency improvements
  • agent harnesses
  • orchestration layers
  • execution frameworks
  • control planes
  • feedback loops
  • system-level optimization
  • diagnosing failure modes
  • measurable improvements
  • production quality
  • reliable
  • safe
  • scalable
  • push the frontier

Other signals

  • agent orchestration
  • runtime systems
  • multi-model orchestration
  • control-plane logic
  • experimentation platforms
  • real-time systems
  • voice interactions