Linguistic Engineer

Meta · Big Tech · Redmond, WA +1

Linguistic Engineer focused on building datasets, pipelines, and models for ML applications within a multimodal assistant product. The role supports ASR/TTS, NLU, NLG, Dialog, LLMs, and Knowledge Graph components, requiring experience with data tools, pipelines, and analytics, as well as programming and text analysis. The position emphasizes multilingual dataset development and evaluation, with opportunities to collaborate directly with ML teams.

What you'd actually do

  1. Build datasets, pipelines, and models for ML applications
  2. Directly support product development with rules, prompts, and data patches
  3. Evaluate the quality of models and product experiences and close the feedback loop
  4. Clearly communicate with project stakeholders
  5. Identify best practices and improve procedures across data systems

Skills

Required

  • Python
  • SQL
  • text analysis
  • scripting
  • relational databases
  • NoSQL databases
  • programming
  • data analysis
  • version control
  • unit tests
  • programming best practices
  • multilingual dataset development
  • responsible, ethical AI practices
  • AI tools integration

Nice to have

  • ML practitioner
  • computational linguistics
  • NLP
  • conversational AI
  • prompt/context engineering
  • agent orchestration
  • knowledge graph integrations with lexicons or ontologies
  • PHP/Hack

What the JD emphasized

  • Experience as an ML practitioner is useful but not required, as the role focuses on datasets
  • Experience with data tools, pipelines, and analytics is needed for this role
  • The ideal candidate will be multilingual, as the role involves building and evaluating datasets across multiple languages for the voice assistant product
  • Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies
  • Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)

Other signals

  • datasets
  • models
  • pipelines
  • ML systems
  • multimodal assistant
  • ASR/TTS
  • NLU
  • NLG
  • Dialog
  • LLMs
  • Knowledge Graph
  • voice assistant product
  • computational linguistics
  • NLP
  • conversational AI