Staff AI Engineer · United States · Remote

Grafana Labs · Data AI · United States · Remote · Sales Operations

The Staff AI Engineer owns the AI agent infrastructure and automation platform powering GTM teams. The role builds multi-agent architectures, LLM integrations, and backend services that connect AI models to internal and third-party data platforms. It also ships production systems, defines technical direction, and partners with Data Engineering, GTM Systems, Field Operations, and GTM leadership to build scalable, self-service automation.

Responsibilities include end-to-end development of multi-agent AI systems, building modular agentic systems, developing reusable agentic skills, implementing observability and feedback loops, establishing governance, building MCP servers and APIs, architecting RAG data flows, and building serverless or containerized services. The role also partners with RevOps and Finance, designs workflows, and builds self-service systems.

Requirements: 8+ years of software engineering experience; 2+ years applying LLMs/AI to production; proficiency in Python and JavaScript; experience with LLM frameworks and patterns (prompt engineering, RAG, function calling); experience building multi-agent systems; familiarity with GCP and BigQuery; and an understanding of LLM failure modes.

What you'd actually do

  1. Own end-to-end development of multi-agent AI systems, from architecture and implementation through testing, deployment, and ongoing operation
  2. Build modular, composable agentic systems using orchestration frameworks (LangChain, CrewAI, Anthropic MCP, or similar) that operate 24/7 across teams
  3. Develop reusable agentic skills that agents invoke across interfaces (Slack, dashboards, internal apps, CLIs)
  4. Implement observability and feedback loops including logging, performance metrics, prompt iteration, model evaluation, and cost management
  5. Establish governance and compliance standards for AI workflows including access controls, audit trails, PII handling, and human-in-the-loop escalation paths

Skills

Required

  • 8+ years of software engineering experience with depth in backend development, systems integration, or data/analytics engineering
  • 2+ years hands-on experience applying LLMs/AI to production workflows, not just prototypes
  • Strong proficiency in Python and JavaScript/Node.js with Git-based workflows, code review practices, and testing discipline
  • Hands-on experience with LLM frameworks and patterns including prompt engineering, RAG, function calling/tool use, structured output parsing, and evaluation
  • Experience building and operating multi-agent systems at scale including agent decomposition, orchestration patterns (sequential chains, router/dispatcher, parallel fan-out), state management, and production monitoring
  • Deep familiarity with Google Cloud Platform, BigQuery, and serverless/containerized services (Cloud Functions, Cloud Run)
  • Understanding of LLM failure modes and production mitigations including confidence thresholds, fallback logic, human escalation, and cost/latency management
  • Proven ability to identify high-leverage problems, push back on low-impact requests, and deliver end-to-end with minimal direction
  • Fluent with AI-assisted development tools (GitHub Copilot, Cursor, Claude Code). You use AI to build AI systems
  • Clear technical communicator: you can explain complex systems in simple terms to both engineers and business stakeholders

Nice to have

  • LangChain, CrewAI, Anthropic MCP, or similar orchestration frameworks
  • Slack, dashboards, internal apps, CLIs
  • n8n, Workato, or custom platforms
  • Grafana's cloud infrastructure

What the JD emphasized

  • own the AI agent infrastructure and automation platform
  • ship production systems
  • own the technical direction
  • define the technical direction
  • Own end-to-end development of multi-agent AI systems
  • Build modular, composable agentic systems
  • Develop reusable agentic skills
  • Implement observability and feedback loops
  • Establish governance and compliance standards for AI workflows
  • Build MCP servers, APIs, CLIs, and microservices
  • Architect data flows for retrieval-augmented generation (RAG)
  • Build serverless or containerized services
  • Partner with RevOps and Finance to build solutions with measurable business outcomes
  • Design and deploy workflows
  • Build systems designed for self-service
  • 2+ years hands-on experience applying LLMs/AI to production workflows, not just prototypes
  • Experience building and operating multi-agent systems at scale
  • diagnose business problems before writing code
  • think in workflows and outcomes, not just functions
  • Understanding of LLM failure modes and production mitigations
  • Proven ability to identify high-leverage problems
  • deliver end-to-end with minimal direction

Other signals

  • building multi-agent architectures
  • LLM integrations
  • backend services that connect AI models to internal and third-party data platforms
  • ship production systems that teams depend on daily
  • own the technical direction
  • define the technical direction for the automation platform
  • partner with Data Engineering, GTM Systems, Field Operations, and GTM leadership to build scalable, self-service automation
  • Own end-to-end development of multi-agent AI systems
  • Build modular, composable agentic systems using orchestration frameworks
  • Develop reusable agentic skills
  • Implement observability and feedback loops
  • Establish governance and compliance standards for AI workflows
  • Build MCP servers, APIs, CLIs, and microservices connecting AI models to business systems
  • Architect data flows for retrieval-augmented generation (RAG)
  • Build serverless or containerized services
  • Partner with RevOps and Finance to build solutions with measurable business outcomes
  • Design and deploy workflows using orchestration tools
  • Build systems designed for self-service
  • apply LLMs/AI to production workflows
  • building and operating multi-agent systems at scale
  • diagnose business problems before writing code
  • think in workflows and outcomes
  • Understanding of LLM failure modes and production mitigations
  • identify high-leverage problems
  • deliver end-to-end with minimal direction