Senior Software Engineer - Agentic AI Applications

Chegg · Consumer · Delhi, India · Remote

Chegg is looking for a Senior Software Engineer to design, build, and own production-grade agentic AI applications for Chegg Skills. This full-stack role spans multi-agent orchestration, retrieval systems, evaluation pipelines, LLM application engineering, and learner-facing experiences. The engineer will lead backend systems and AI engineering while also shipping React interfaces, moving between model experimentation, production engineering, and product iteration. The team is building the first version of the Agentic Skills Framework and expects to evolve the platform over the coming years.

What you'd actually do

  1. Architect and ship production agentic AI applications — multi-step workflows, tool use, structured outputs, territory-restricted reasoning, and confidence-scored validation loops.
  2. Design and improve retrieval systems across multiple modes: catalog-level retrieval for curriculum matching, content-level RAG for in-lesson Q&A, and lexical or graph approaches where vector search underperforms.
  3. Build the LLM application layer: prompt orchestration, structured outputs, function/tool calling, model routing via the AI gateway, conversation memory, and context management.
  4. Ship learner-facing AI experiences (tutoring, practice, evaluation, coach) and operator-facing tools (program authoring, evaluation dashboards, admin) end-to-end in React + TypeScript.
  5. Build evaluation infrastructure that makes AI quality measurable — gold datasets, rubrics, regression pipelines, threshold calibration, factuality and groundedness metrics, A/B test scaffolding.

Skills

Required

  • 7+ years of professional software engineering experience
  • Strong full-stack depth across backend services and product surfaces
  • Production experience with Python (FastAPI/Flask) for building scalable AI services and APIs
  • Production experience with React + TypeScript building real product surfaces
  • Hands-on production experience building LLM applications — prompt engineering, structured outputs (JSON schemas, function/tool calling), conversation memory, and evaluation
  • Practical experience with agentic AI patterns: multi-step orchestration, ReAct or planner-style agents, custom tool integration, iteration limits, and confidence scoring
  • Production RAG experience end-to-end: chunking strategies, embeddings, vector databases (Milvus/Zilliz, Pinecone, pgvector, or similar), threshold calibration, hybrid/lexical retrieval, grounding, and hallucination mitigation
  • Experience designing AI product surfaces that handle streaming responses, partial outputs, loading states, tool-call visualization, and recoverable errors — with care for learner experience
  • Experience working across multiple LLM providers (OpenAI, Anthropic, Google) with a clear sense of cost, latency, quality, and context trade-offs
  • AI safety and guardrails experience

Nice to have

  • Familiarity with Kotlin/Java

What the JD emphasized

  • production-grade agentic AI applications
  • full-stack with AI depth
  • production engineering
  • product iteration
  • long-arc role
  • shape what comes next
  • Production experience with Python
  • Production experience with React + TypeScript
  • Hands-on production experience building LLM applications
  • Practical experience with agentic AI patterns
  • Production RAG experience end-to-end
  • Experience designing AI product surfaces
  • Experience working across multiple LLM providers
  • AI safety and guardrails experience

Other signals

  • building agentic AI applications
  • full-stack with AI depth
  • production engineering
  • product iteration
  • long-arc role
  • shape what comes next