Sr Specialist Quality/m&p/process - AI Training Manager

AT&T · Telecom · Dallas, TX +1

This role focuses on managing the quality and continuous improvement of AI-powered agents that autonomously process tickets within workflow systems. Responsibilities include monitoring agent performance, reviewing and correcting fallout tickets, improving training data and knowledge bases, and ensuring compliance with business needs and regulatory standards. The role involves analyzing fallout patterns, developing training protocols, and reporting on key performance metrics.

What you'd actually do

  1. Monitor the performance of Agentic Capabilities as they autonomously process various ticket types.
  2. Review fallout tickets (cases where AI agents cannot resolve issues) within a workflow management tool.
  3. Analyze fallout patterns to identify knowledge gaps, process inefficiencies, or opportunities for AI improvement.
  4. Develop and implement training protocols to enhance Agentic Capabilities, leveraging prompt engineering, model validation, and knowledge base updates.
  5. Validate AI performance through systematic review, testing, and user/stakeholder feedback.

Skills

Required

  • Understanding of the business function and its M&Ps, with the ability to align AI behavior with policy and process; knowledge of supported workflows and tools, including hands-on experience operating the systems involved (e.g., ticketing/workflow platforms).
  • Strong analytical and problem-solving skills with meticulous attention to detail; proven root cause analysis (RCA) capability.
  • General AI literacy and understanding of agentic systems; basic prompt engineering (iteration, testing, versioning).
  • Ability to manage fallout within SLAs, triage tickets, and drive rapid resolution; strong prioritization in fast-paced environments.
  • Observability: proficiency with logs, metrics, dashboards, and alerts; ability to define and track quality KPIs (accuracy, fallout rate, MTTR).
  • Basic scripting skills to automate corrections, content re-ingestion, and validation workflows.
  • Knowledge base authoring and maintenance; clear documentation of training methods, resolutions, and changes for auditability.
  • Awareness of compliance, data privacy, and ethical guidelines; ability to maintain auditable processes and change logs.
  • Effective communication: synthesize findings, report metrics, and present recommendations to stakeholders.
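
For candidates wondering what "define and track quality KPIs" and "basic scripting" might look like in practice, here is a minimal sketch. It is not taken from the posting: the Ticket record, its field names, and the three KPI definitions (fallout rate as fallout tickets over all tickets, accuracy as reviewer-confirmed autonomous resolutions over reviewed ones, MTTR as mean hours from open to resolution for fallout tickets) are all assumptions made for illustration. A real workflow tool would expose different fields and the team may define these metrics differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Ticket:
    # Hypothetical record shape; a real workflow tool will differ.
    ticket_id: str
    opened_at: datetime
    resolved_at: Optional[datetime]  # None while the ticket is still open
    fell_out: bool                   # True if the agent could not resolve it
    agent_correct: Optional[bool]    # Reviewer verdict; None if not reviewed


def quality_kpis(tickets: list[Ticket]) -> dict[str, Optional[float]]:
    """Compute illustrative KPIs; every definition here is an assumption."""
    total = len(tickets)
    fallout = [t for t in tickets if t.fell_out]
    # Accuracy is measured only on agent-resolved tickets a reviewer has checked.
    reviewed = [t for t in tickets if not t.fell_out and t.agent_correct is not None]
    resolved_fallout = [t for t in fallout if t.resolved_at is not None]

    fallout_rate = len(fallout) / total if total else None
    accuracy = (
        sum(t.agent_correct for t in reviewed) / len(reviewed) if reviewed else None
    )
    # MTTR in hours, over fallout tickets that have been resolved.
    mttr_hours = (
        sum((t.resolved_at - t.opened_at).total_seconds() for t in resolved_fallout)
        / len(resolved_fallout)
        / 3600
        if resolved_fallout
        else None
    )
    return {"fallout_rate": fallout_rate, "accuracy": accuracy, "mttr_hours": mttr_hours}


# Made-up sample data, for illustration only.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Ticket("T1", start, start + timedelta(hours=1), False, True),
    Ticket("T2", start, start + timedelta(hours=1), False, False),
    Ticket("T3", start, start + timedelta(hours=6), True, None),
    Ticket("T4", start, start + timedelta(hours=2), True, None),
]
kpis = quality_kpis(sample)
print(kpis)  # {'fallout_rate': 0.5, 'accuracy': 0.5, 'mttr_hours': 4.0}
```

Each metric returns None when it has no data to work from, so an empty reporting period is not mistaken for a perfect or a failing score.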

Nice to have

  • Advanced observability (distributed tracing, SLO/SLA design) and incident response practices.
  • Experiment tracking and ML operations tooling; feature flags and canary/rollback strategies.
  • Familiarity with fine-tuning pipelines, retrieval/RAG, vector databases, and content ingestion pipelines.
  • SQL/BI tools for advanced analytics and dashboarding; ability to build executive-ready reports.
  • Version control (Git) for prompts, KB content, and evaluation artifacts; change management discipline.
  • Workflow orchestration for scheduled re-ingestion, evaluations, and reporting.
  • Experience in training, quality assurance, documentation, or knowledge management, including taxonomy/ontology design.
  • Advanced scripting/automation and experience writing/maintaining Markdown-based runbooks and KB articles.
  • Prior experience with AI in production settings and A/B testing platforms.

What the JD emphasized

  • 48-hour window
  • fallout tickets
  • AI improvement
  • compliance requirements
  • quality KPIs

Other signals

  • AI-powered agents
  • autonomous processing
  • workflow management systems
  • AI agent quality
  • continuous improvement