Software Engineer, AI Gateway

Vercel · Enterprise · United States · Remote · Engineering

Vercel is hiring a Software Engineer to build and enhance an AI Gateway platform that provides a unified API for accessing AI models from multiple providers. The role focuses on developing reliable, low-latency systems with features such as rate limiting, intelligent failovers, and seamless integrations to ensure production-ready reliability for AI workloads.

What you'd actually do

  1. Contribute to the design, implementation, and maintenance of the AI Gateway platform, emphasizing features like unified API endpoints, rate limit management, and intelligent failover mechanisms to boost uptime and reliability.
  2. Write clean, efficient, and well-documented code, conducting thorough testing to ensure low-latency responses and stability for high-volume AI inference requests.
  3. Collaborate with cross-functional teams, including product managers, AI researchers, and infrastructure engineers, to integrate new AI providers and models while addressing scalability and performance challenges.
  4. Engage with the open-source community, contribute to the AI SDK and related projects, and align with Vercel's ethos of fostering developer tools.
  5. Gather user feedback and analytics to drive innovations in AI Gateway, such as enhanced billing unification, provider-agnostic authentication, and analytics for usage insights.
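To make the "unified API with intelligent failover" responsibility concrete, here is a minimal sketch in TypeScript of what a provider-agnostic call with automatic fallback can look like. All names (`Provider`, `chatWithFailover`) are illustrative assumptions, not Vercel's actual API:

```typescript
// Hypothetical shape for a provider behind the gateway.
type Provider = {
  name: string;
  chat: (prompt: string) => Promise<string>;
};

// Try each provider in priority order; fall back automatically on failure,
// so callers always see one consistent response shape regardless of vendor.
async function chatWithFailover(
  providers: Provider[],
  prompt: string
): Promise<{ provider: string; reply: string }> {
  let lastError: unknown;
  for (const p of providers) {
    try {
      const reply = await p.chat(prompt);
      return { provider: p.name, reply };
    } catch (err) {
      lastError = err; // record the failure and fall through to the next provider
    }
  }
  throw new Error(`all providers failed: ${String(lastError)}`);
}
```

A real gateway would layer health checks, per-provider timeouts, and retry budgets on top, but the core idea is the same: the caller targets one endpoint, and the gateway decides which upstream actually serves the request.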

Skills

Required

  • JavaScript/TypeScript
  • backend development
  • APIs
  • cloud infrastructure
  • AI/ML integrations
  • API gateways
  • distributed systems
  • high-throughput services
  • rate limiting
  • caching
  • failovers

Nice to have

  • open source contributions

What the JD emphasized

  • 5+ years of relevant experience
  • Strong proficiency in JavaScript/TypeScript and experience with backend development, APIs, and cloud infrastructure
  • Experience with AI/ML integrations, API gateways, distributed systems, or handling high-throughput services (e.g., rate limiting, caching, failovers)
  • low-latency responses
  • high-volume AI inference requests

Other signals

  • unified API for accessing hundreds of AI models
  • low-latency systems
  • rate limiting
  • intelligent failovers
  • seamless integrations
  • production-ready reliability for AI workloads
  • automatic fallbacks during outages
  • consistent performance across providers
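The rate limiting the JD emphasizes is commonly implemented as a per-key token bucket, which permits short bursts while enforcing a steady-state rate. A minimal sketch, assuming illustrative names and parameters (not Vercel's implementation):

```typescript
// Hypothetical token-bucket rate limiter: capacity bounds burst size,
// refillPerSec sets the sustained request rate.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number, // maximum burst size in tokens
    private refillPerSec: number, // tokens added per second
    now: number = Date.now()
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if the request may proceed, false if it should be throttled.
  allow(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

In a gateway, one bucket would typically be kept per API key (often in a shared store such as Redis so limits hold across instances); a rejected request maps to an HTTP 429 response.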