Software Development Engineer in Test (SDET) – ML & LLM Systems

Comcast · Media · Washington, DC

This role focuses on evaluating, validating, and measuring LLM behavior within NLP pipelines and ML quality frameworks. The engineer will design and implement automated test strategies and frameworks for ML models, NLP systems, and backend services, including model validation, benchmarking, and drift detection. Experience with LLM evaluation frameworks and testing ML models is required.
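As a concrete illustration of the "evaluating, validating, and measuring LLM behavior" work described above, here is a minimal sketch of a scoring harness. The `generate` callable and the exact-match metric are illustrative assumptions, not part of the posting or any specific Comcast framework.

```python
def exact_match_rate(generate, cases):
    """Score a model callable against labeled (prompt, expected) pairs.

    `generate` is any function taking a prompt string and returning a
    completion string (a hypothetical interface). A real harness would
    also track latency and cost, and use fuzzier metrics such as
    embedding similarity or LLM-as-judge scoring.
    """
    hits = sum(
        1
        for prompt, expected in cases
        if generate(prompt).strip().lower() == expected.strip().lower()
    )
    return hits / len(cases)


if __name__ == "__main__":
    # Stubbed "model" standing in for a real LLM endpoint.
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    stub = lambda prompt: answers.get(prompt, "unknown")

    cases = [("What is 2 + 2?", "4"), ("Capital of France?", "London")]
    print(exact_match_rate(stub, cases))  # one hit out of two -> 0.5
```

In practice such a function would sit inside an automated regression suite, so a drop in the score between model versions fails the build rather than reaching production.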

What you'd actually do

  1. Design, develop, and maintain automated test code using software engineering best practices, ensuring reusability, reliability, and scalability.
  2. Build and evolve test strategies and automation frameworks to support changing services, workflows, and ML model testing requirements.
  3. Perform model validation and verification, including accuracy benchmarking, regression testing, drift detection, and reproducibility checks before and after deployment.
  4. Develop and enhance automated test suites for functional, regression, and performance testing across machine learning models, NLP systems, and supporting backend services.
  5. Investigate failing tests, diagnose root causes, and implement fixes to improve test stability and confidence.
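Several of the duties above, drift detection in particular, reduce to comparing a model's current input or score distribution against a baseline. One common check is the Population Stability Index (PSI). The sketch below is a from-scratch illustration with conventional rule-of-thumb thresholds; it is an assumption about the kind of check involved, not a description of Comcast's actual framework.

```python
import math


def population_stability_index(baseline, current, bins=10):
    """PSI between two numeric samples, binned on the baseline's range.

    Rule of thumb: PSI < 0.1 suggests no meaningful drift, 0.1-0.25
    moderate drift, and > 0.25 usually warrants investigation.
    """
    lo, hi = min(baseline), max(baseline)
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def distribution(values):
        counts = [0] * bins
        for v in values:
            # Index of the bin this value falls in (0 .. bins-1).
            counts[sum(1 for e in edges if v >= e)] += 1
        # Floor at a tiny value so empty bins do not blow up the log.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = distribution(baseline), distribution(current)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    baseline = [i / 10 for i in range(1000)]  # scores 0.0 .. 99.9
    shifted = [v + 30 for v in baseline]      # simulated distribution shift
    print(population_stability_index(baseline, baseline))  # identical -> 0.0
    print(population_stability_index(baseline, shifted))   # well above 0.25
```

Wired into a CI/CD pipeline, a check like this can gate deployment: the job fails when the PSI between training-time and live feature distributions crosses the alert threshold.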

Skills

Required

  • Software Development Engineer in Test (SDET)
  • API testing
  • RESTful APIs
  • Test processes
  • Test design
  • Test case creation
  • Defect management systems
  • Java
  • Spring Boot
  • Python
  • Testing machine learning models
  • NLP services
  • Large Language Models (LLMs)
  • LLM evaluation frameworks
  • Test automation frameworks
  • Linux
  • Docker
  • Kubernetes
  • CI/CD pipelines
  • Jenkins
  • AWS services
  • S3
  • NoSQL databases
  • MongoDB
  • Agile development methodologies
  • Scrum teams
  • Written and verbal communication skills
  • Collaboration

Nice to have

  • Comcast experience

What the JD emphasized

  • LLM evaluations
  • ML quality frameworks
  • evaluating, validating, and measuring LLM behavior
  • LLM evaluation frameworks

Other signals

  • ensuring model reliability, performance, and responsible deployment