Senior Staff GenAI Engineer - Application Performance Monitoring (APM)

Datadog · Enterprise · New York, NY · Dev Eng

Senior Staff GenAI Engineer focused on agentic workflows within Datadog's APM product. The role involves leading the design, training, evaluation, and deployment of GenAI/ML models at scale, with a strong emphasis on product-mindedness and end-to-end system ownership.

What you'd actually do

  1. Serve as the technical owner for GenAI initiatives within APM, leading design, development, and deployment of ML/AI-powered features across multiple teams.
  2. Guide long-term strategy and technical direction for GenAI workflows across APM and related products.
  3. Build and benchmark GenAI/ML models using state-of-the-art techniques.
  4. Collaborate with cross-functional teams to build automated investigation and triaging tools.
  5. Influence product direction by bringing a strong product mindset to your work, always advocating for the end user.

Skills

Required

  • Deep experience in GenAI/ML
  • Experience in agentic workflows
  • Product-minded ML engineer
  • Strong technical expertise
  • Excellent communication skills
  • Track record of driving impactful initiatives end to end
  • Proven track record of leading large-scale GenAI/ML initiatives in a product-driven environment
  • Highly proficient in model development, deployment, evaluation, and optimization
  • Hands-on experience in building end-to-end ML systems
  • Ability to drive initiatives across cross-functional teams
  • Ability to solve ambiguous challenges
  • BS/MS/PhD in a scientific field or equivalent experience
  • 10+ years of relevant engineering experience
  • 4+ years leading cross-team technical initiatives

Nice to have

  • Mentoring engineers
  • Thought leadership
  • Use AI coding tools
  • Validate, critique, and refine AI-generated output
  • Push the boundaries of how AI can improve software engineering best practices
  • Contribute to building AI-enabled products

What the JD emphasized

  • leading design, development, and deployment
  • leading large-scale GenAI/ML initiatives
  • building end-to-end ML systems
  • drive initiatives across cross-functional teams

Other signals

  • GenAI/ML
  • agentic workflows
  • deploy GenAI/ML models at scale
  • end-to-end ML systems