Speech recognition, synthesis, and audio understanding — TTS, ASR, voice agents, and audio-native LLMs. Primary AI lifecycle stage: application.
301 active AI roles across 54 companies in our index reference Audio & speech as of today.
The companies with the most active Audio & speech listings are Google (62 roles), Amazon (49 roles), xAI (32 roles), Meta (12 roles), and Apple (10 roles).
Audio & speech primarily belongs to the application stage of the AI lifecycle. In current hiring, Audio & speech roles concentrate at the post-training (24%) and agents (20%) stages.
The sectors with the most active Audio & speech hiring are Big Tech, AI Frontier, and Enterprise.
14 AI roles tagged audio_speech.
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| DoorDash | AI Research Fellowship, (Summer and Fall 2026) | Consumer | 9 | Agent orchestration · Tool use · Evals · Forecasting · Multimodal · Vision · Frontier research · Synthetic data |
| Uber | Senior Staff Machine Learning Engineer – Moonshot AI | Consumer | 9 | Multimodal · Vision · LLM observability · Evals · Fine-tuning · RAG · Model serving · Recommender systems |
| Zillow | Principal Machine Learning Engineer, Agentic AI | Consumer | 9 | Agent orchestration · Multimodal · Agent research · Model serving · Inference infra |
| Uber | Staff ML Engineer, Generative AI | Consumer | 9 | Agent orchestration · Tool use · Evals · Guardrails · LLM observability · RAG · Fine-tuning · Model serving · Multimodal |
| Spotify | Senior Research Scientist - Music | Consumer | 9 | Fine-tuning · Multimodal |
| Snap | Machine Learning Engineer, Generative ML, Level 5 | Consumer | 8 | Multimodal · Inference infra · Model serving |
| Instacart | Senior Software Engineer II, AI Labs & Foundations | Consumer | 8 | Agent orchestration · RAG · Vector DB · Model serving · LLM observability |
| Spotify | Staff Machine Learning Engineer - Content Intelligence | Consumer | 8 | Multimodal · Vision · Fine-tuning · Model serving |
| Uber | Sr. Staff Engineer (Conversational/Voice AI) | Consumer | 8 | Agent orchestration · Tool use · Evals · Guardrails · LLM observability · RAG · Model serving · Multimodal |
| Spotify | Senior Machine Learning Engineer - Content Intelligence | Consumer | 7 | Vision · Multimodal · Fine-tuning · Model serving |
| Spotify | Senior Machine Learning Engineer - Enrichment & Content Intelligence | Consumer | 7 | Multimodal · Vision |
| Discord | Manager, Scaled Abuse Countermeasures and Research | Consumer | 7 | Agent orchestration · Evals · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Model serving · Recommender systems · Search & ranking · Vision · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · Reward modeling · RL robotics · Embodied AI |
| Whatnot | Software Engineer, Trust & Risk | Consumer | 7 | Agent orchestration · Evals · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Inference infra · Model serving · Recommender systems · Search & ranking · Vision · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · Reward modeling · RL robotics · Embodied AI |
| Uber | Senior Software Engineer - Communications Platform (Backend) | Consumer | 5 | Agent orchestration |