Sr Specialist Quality/m&p/process - AI … at AT&T

This position requires office presence of a minimum of 5 days per week and is only located in the location(s) posted. No relocation is offered.

At AT&T, we empower leaders to drive change in a fast-evolving, connected world. Your strategic vision will help serve customers and transform lives through innovative solutions and impactful connections.

The Sr Specialist Quality/M&P/Process - AI Training Manager is responsible for overseeing the training, validation, and continuous improvement of Agentic Capabilities—AI-powered agents designed to autonomously process a variety of ticket types within workflow management systems. This role owns AI agent quality: ensuring reliable, high-quality outcomes; rapidly reviewing and resolving exception (“fallout”) tickets; applying corrections and re-ingesting updates; improving training data and fine-tuning artifacts; updating the agent knowledge base; and rerunning tickets to validate fixes—continuously strengthening agentic capabilities.

Key Responsibilities

Agentic Capability Management

Monitor the performance of Agentic Capabilities as they autonomously process various ticket types.
Ensure seamless integration of AI agents into new or existing workflows, optimizing for efficiency and accuracy.

Fallout Review & Correction

Review fallout tickets (cases where AI agents cannot resolve issues) within a workflow management tool.
Diagnose root causes, make necessary corrections, and re-ingest updated information to the AI system.
Ensure all fallout tickets are actioned within a 48-hour window; unresolved tickets revert to the human-worked queue.

Training & Continuous Improvement

Analyze fallout patterns to identify knowledge gaps, process inefficiencies, or opportunities for AI improvement.
Develop and implement training protocols to enhance Agentic Capabilities, leveraging prompt engineering, model validation, and knowledge base updates.
Collaborate with cross-functional teams (product, engineering, support) to align AI behaviors with business needs and compliance requirements.

Knowledge Base & Documentation

Maintain and update the agent knowledge base, ensuring accurate, current, and comprehensive content for AI agents.
Document training methodologies, ticket resolutions, and process improvements for knowledge sharing and auditing.

Quality Assurance & Compliance

Validate AI performance through systematic review, testing, and user/stakeholder feedback.
Ensure all processes comply with regulatory standards, ethical guidelines, and company policies.

Reporting & Communication

Track and report on key metrics: ticket resolution rates, fallout frequency, review turnaround times, and AI improvement outcomes.
Apply continuous improvement practices.
Communicate insights, best practices, and recommendations to stakeholders and leadership.

Skills & Qualifications

Understanding of the business function and M&Ps; align AI behavior with policy and process; plus knowledge of supported workflows and tools with experience operating the systems involved (e.g., ticketing/workflow platforms).
Strong analytical and problem-solving skills with meticulous attention to detail; proven root cause analysis (RCA) capability.
General AI literacy and understanding of agentic systems; basic prompt engineering (iteration, testing, versioning).
Ability to manage fallout within SLAs, triage tickets, and drive rapid resolution; strong prioritization in fast-paced environments.
Observability: proficiency with logs, metrics, dashboards, and alerts; define and track quality KPIs (accuracy, fallout rate, MTTR).
Basic scripting understanding to automate corrections, content re-ingestion, and validation workflows.
Knowledge base authoring and maintenance; clear documentation of training methods, resolutions, and changes for auditability.
Compliance/data privacy/ethical guidelines awareness; maintain auditable processes and change logs.
Effective communication: synthesize findings, report metrics, and present recommendations to stakeholders.

Preferred Skills

Advanced observability (distributed tracing, SLO/SLA design) and incident response practices.
Experiment tracking and ML operations tooling, feature flags, canary/rollback strategies.
Familiarity with fine-tuning pipelines, retrieval/RAG, vector databases, and content ingestion pipelines.
SQL/BI tools for advanced analytics and dashboarding; ability to build executive-ready reports.
Version control (Git) for prompts, KB content, and evaluation artifacts; change management discipline.
Workflow orchestration for scheduled re-ingestion, evaluations, and reporting.
Experience in training, quality assurance, documentation, or knowledge management, including taxonomy/ontology design.
Advanced scripting/automation and experience writing/maintaining Markdown-based runbooks and KB articles.
Prior experience with AI in production settings and A/B testing platforms.

Job Contribution: An experienced professional with in-depth knowledge, applying organizational practices to resolve moderately difficult problems. Works with independent judgement on expansive projects with minimal supervision, implementing policy changes to improve functions. Actions impact efficiency, costs, schedules, and client relationships. Interacts primarily within the department and with General Managers and above across various teams.

Supervisor: No

Education/Experience: Bachelor’s degree (BS/BA) desired. 2+ years of related experience. Certification is required in some areas.

Our Sr Specialist Quality/M&P/Process - AI Training Manager earns between $87,200 and $130,800. Not to mention all the other amazing rewards that working at AT&T offers. Individual starting salary within this range may depend on geography, experience, expertise, and education/training.

Joining our team comes with amazing perks and benefits:

Medical/Dental/Vision coverage
401(k) plan
Tuition reimbursement program
Paid Time Off and Holidays (based on date of hire, at least 23 days of vacation each year and 9 company-designated holidays)
Paid Parental Leave
Paid Caregiver Leave
Additional sick leave beyond what state and local law require may be available but is unprotected
Adoption Reimbursement
Disability Benefits (short term and long term)
Life and Accidental Death Insurance
Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
Employee Assistance Programs (EAP)
Extensive employee wellness programs
Employee discounts up to 50% off on eligible AT&T mobility plans and accessories, AT&T internet (and fiber where available) and AT&T phone

If you’re ready to make an impact on our business and your career, bring your bold ideas to a world of possibility.

Apply today!

Weekly Hours:

Time Type:

Regular

Location:

Dallas, Texas; Richardson, Texas

Salary Range:

$87,200.00 - $130,800.00

It is the policy of AT&T to provide equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, AT&T will provide reasonable accommodations for qualified individuals with disabilities. AT&T is a fair chance employer and does not initiate a background check until an offer is made.