Software Engineer, Model Routing & Inference

Cursor · Coding AI · New York, NY · Engineering

A Software Engineer role on the Model Routing & Inference team, which builds and evolves the inference platform that powers all AI interactions in the product, with a focus on speed, reliability, and cost-effectiveness at scale.

What you'd actually do

  1. Build the inference platform that powers every AI interaction in the product.
  2. Own the full inference path, making Cursor's AI faster, more reliable, and more cost-effective at a scale few teams in the world get to operate.
  3. Build and evolve our inference gateway, a single abstraction over every provider's API semantics, so model onboarding becomes a config change.
  4. Design intelligent cross-provider failover so no single provider outage causes user-visible degradation.
  5. Design routing backpressure and admission control so traffic spikes don't cascade into providers.

Skills

Required

  • building high-throughput, low-latency distributed systems
  • inference serving
  • traffic routing
  • real-time data pipelines
  • cost/performance tradeoffs at scale
  • GPU utilization
  • provider economics
  • capacity planning
  • strong software engineering fundamentals
  • shipping production systems

Nice to have

  • reasoning about reliability, cost, latency, and user experience

What the JD emphasized

  • high-throughput, low-latency distributed systems
  • inference serving
  • millions of requests

Other signals

  • inference platform
  • high-throughput, low-latency distributed systems
  • millions of requests