Data Scientist - National Federal Tax Services

This role focuses on building and optimizing LLM-powered prototypes and production-ready components using Generative AI (GenAI) and Natural Language Processing (NLP). Responsibilities include designing prompts and RAG workflows, packaging models as APIs, supporting cloud deployment, and ensuring outputs are evaluated, monitored, and aligned with responsible AI expectations. Requires strong Python proficiency, GenAI framework experience, and communication skills.

What you'd actually do

  1. Build and optimize GenAI/NLP models and prototypes using modern frameworks (e.g., OpenAI, Hugging Face).
  2. Design, test, and refine prompts to improve quality and reduce risk (e.g., hallucinations, bias, unsafe outputs).
  3. Implement foundational Retrieval-Augmented Generation pipelines to enable context-aware applications.
  4. Package models as APIs and support deployments on AWS (Amazon Web Services), Azure, or GCP (Google Cloud Platform).
  5. Develop evaluation approaches and monitor model outputs for reliability, performance drift, and compliance needs.

Skills

Required

  • Python proficiency
  • GenAI frameworks (LangChain, OpenAI/GPT APIs, Hugging Face)
  • software engineering best practices (Git, CI/CD, automated testing)
  • R, Python, SQL
  • cloud ML deployment (Azure, AWS, GCP)

Nice to have

  • fine-tuning LLMs
  • building conversational AI agents
  • Responsible AI, privacy, security, and AI ethics considerations
  • data engineering
  • MLOps

What the JD emphasized

  • GenAI and Natural Language Processing (NLP) skills
  • build and optimize LLM-powered prototypes and production-ready components
  • design prompts and Retrieval-Augmented Generation (RAG) workflows
  • package models as APIs
  • support cloud deployment
  • outputs are evaluated, monitored, and aligned to responsible AI expectations
  • hands-on Python development
  • solid statistical modeling fundamentals
  • communicate clearly across technical and non-technical stakeholders
  • 1+ year hands-on building LLM/NLP solutions
  • Strong Python proficiency and hands-on experience with GenAI frameworks such as LangChain, OpenAI/GPT APIs, and Hugging Face
  • Experience fine-tuning LLMs and/or building conversational AI agents

Other signals

  • LLM-powered prototypes
  • production-ready components
  • GenAI and NLP skills
  • design prompts and RAG workflows
  • package models as APIs
  • cloud deployment
  • evaluated, monitored, and aligned to responsible AI expectations
  • hands-on Python development
  • solid statistical modeling fundamentals
  • communicate clearly across technical and non-technical stakeholders