Software Engineer

Cognition · Coding AI · San Francisco, CA · Research & Development

This Software Engineer role focuses on building core agent infrastructure for AI software engineering products such as Devin and Windsurf. Responsibilities include designing and shipping systems for tool use, context management, multi-step planning, subagent orchestration, and sandboxed code execution. The role also involves improving Windsurf as an AI-native IDE, translating new model capabilities into shipped features, and owning reliability and performance at scale.

What you'd actually do

  1. Build core agent infrastructure: Design and ship the systems that power Devin's long-horizon task execution: tool use, context management, multi-step planning, subagent orchestration, and sandboxed code execution environments.
  2. Improve Windsurf as an AI-native IDE: Contribute to editor intelligence, agent-in-the-loop workflows, real-time code understanding, and the developer experience that makes Windsurf different from every other IDE.
  3. Close the loop between models and products: Work directly with researchers to translate new model capabilities into shipped features; your feedback shapes what gets prioritized in training.
  4. Own reliability and performance at scale: Build systems that handle millions of agentic tasks with low latency, high reliability, and the kind of correctness that developers depend on in production.
  5. Move the category forward: Cognition is defining what AI software engineering looks like. You will have real input into what gets built next and why.

Skills

Required

  • Python proficiency
  • experience building reliable, performant distributed systems
  • experience shipping products
  • experience with LLMs and AI agents

Nice to have

  • experience at a frontier AI lab
  • experience at an applied AI company
  • experience at a developer tools company
  • competitive programming background
  • experience with agent orchestration
  • experience with tool use
  • experience with context management
  • experience with multi-step planning
  • experience with subagent orchestration
  • experience with sandboxed code execution environments
  • experience with editor intelligence
  • experience with agent-in-the-loop workflows
  • experience with real-time code understanding
  • experience with low-latency systems
  • experience with high-reliability systems

What the JD emphasized

  • hardest open problems in applied AI
  • reason across thousands of lines of code
  • spawn and coordinate subagents
  • use tools reliably across ambiguous long-horizon tasks
  • real engineer would trust
  • millions of developers use
  • move fast without cutting corners
  • systems that power Devin's long-horizon task execution
  • tool use
  • context management
  • multi-step planning
  • subagent orchestration
  • sandboxed code execution environments
  • agent-in-the-loop workflows
  • real-time code understanding
  • millions of agentic tasks
  • low latency
  • high reliability
  • correctness that developers depend on in production
  • defining what AI software engineering looks like
  • systems engineering depth
  • building reliable, performant distributed systems
  • strong opinions about correctness, failure modes, and production behavior
  • shipped things that real people depend on
  • make progress on hard problems with incomplete specs
  • learn fast from results
  • course-correct without needing a lot of direction
  • shipping quickly
  • code quality
  • dug into how LLMs work
  • how agents fail
  • make AI-powered systems behave reliably in the real world
  • Python is the primary language
  • own large Python codebases in production
  • frontier AI lab
  • applied AI company
  • developer tools company

Other signals

  • building end-to-end software agents
  • shipping systems that go directly into Devin and Windsurf
  • defining what AI software engineering looks like