Audio & speech

Speech recognition, synthesis, and audio understanding — TTS, ASR, voice agents, and audio-native LLMs.

Primary AI lifecycle stage: application.

As of today, 301 active AI roles across 54 companies in our index reference Audio & speech. Hiring concentrates at the post-training (24%) and agents (20%) stages. Most common sectors: Big Tech, AI Frontier, Enterprise.

Top hiring: Google (62 roles), Amazon (49 roles), xAI (32 roles), Meta (12 roles), Apple (10 roles).

Roles by sector: Big Tech (268), AI Frontier (94), Enterprise (45), Semiconductors (22), Vertical AI (18), Consumer (14), Data AI (8), Banking (8), Media (6), Telecom (4), Multimodal (4), Pharma (3), Hospitality (2), Fintech (2), Robotics (1), Consulting (1), Auto (1).

Roles by function: Engineering (373), Research (102), Product (26).

Filtered by sector: Big Tech

268 AI roles tagged audio_speech.

Company	Title	Sector	AI score	Other tags
Meta	Staff Research Scientist, FAIR (RL / LLM's)	Big Tech	10	Frontier research · Pretraining · Vision · Multimodal · Code gen
Google	Senior Staff Software Engineer, Cognitive Architecture, Special Projects	Big Tech	10	Interpretability · Agent orchestration · Agent research · RL robotics · Model serving · Evals
Apple	AIML - Machine Learning Researcher, MLR	Big Tech	10	Frontier research · Pretraining · RL post-training · Multimodal
Meta	AI Research Scientist - Meta Superintelligence Labs	Big Tech	10	Frontier research · Pretraining · RL post-training · Fine-tuning · Evals
Meta	AI Research Scientist, Audio-Visual Understanding, FAIR	Big Tech	10	Multimodal · Vision · Frontier research · Evals
Meta	Research Scientist Intern, FAIR - Language & Multimodal Foundations (PhD)	Big Tech	10	Frontier research · Pretraining · Multimodal · Vision
Google	Senior Software Engineer, AI/ML, LLM Modeling	Big Tech	9	Fine-tuning · RL post-training · Model serving · Inference infra · Evals · LLM observability · Agent orchestration · Tool use · RAG
Google	Research Software Engineer	Big Tech	9	Model serving · Inference infra · LLM observability
Amazon	Principal Solutions Architect, Generative AI, AWS Industries, Telco	Big Tech	9	Agent orchestration · Tool use · RAG · Fine-tuning · Model serving · Multimodal · Guardrails
Meta	Research Scientist Intern, Photorealistic Telepresence (PhD)	Big Tech	9	Multimodal · Vision · Agent research · Frontier research · Fine-tuning
Google	Research Software Engineer, Multimodal AI	Big Tech	9	Agent orchestration · Multimodal · Vision · LLM observability · Fine-tuning · Evals
Google	Staff Software Engineer, On-Device Hybrid Multimodal AI	Big Tech	9	Agent orchestration · Multimodal · Model serving · Inference infra · Vision
Amazon	Applied Scientist II	Big Tech	9	Fine-tuning
Google	Applied AI Engineer, Audio, XR	Big Tech	9	Fine-tuning · Model serving · Inference infra
Google	Senior Software Engineer	Big Tech	9	Model serving · Inference infra · Multimodal · LLM observability
Google	Senior Technical Program Manager Lead, Gemini Audio, DeepMind	Big Tech	9	Evals · Model serving · Fine-tuning · Frontier research
Google	Gemini Audio Research Scientist, DeepMind	Big Tech	9	RL post-training · Evals · Multimodal
Amazon	Sr. Applied Scientist, Trust CX Innovations&AI Policy	Big Tech	9	Multimodal · Frontier research · Fine-tuning · Model serving
Google	Senior Staff Software Engineer, AI/ML, Applied AI	Big Tech	9	Agent orchestration · Multimodal · Model serving · Evals
Google	Research Scientist, Frontier Health, DeepMind	Big Tech	9	Agent orchestration · Multimodal · RL post-training · Reward modeling · Evals · Tool use · Vision
Amazon	Applied Scientist II, Amazon AWS Agentic AI, AWS AI Fundamental Research	Big Tech	9	Agent research · Frontier research · Multimodal · Vision · Agent orchestration · Fine-tuning
Amazon	Applied Science Manager, Alexa Edge AI	Big Tech	9	Multimodal · Model serving · Inference infra
Amazon	Applied Scientist, Alexa Edge AI	Big Tech	9	Vision · Multimodal · Fine-tuning · Frontier research · Model serving · Inference infra
Amazon	Applied Scientist, Alexa Edge AI	Big Tech	9	Multimodal · Vision · Fine-tuning · Model serving · Inference infra
Amazon	Applied Scientist, Alexa Edge AI	Big Tech	9	Vision · Multimodal · Fine-tuning · Model serving · Inference infra
Google	Senior Software Engineer, Applied AI Commerce	Big Tech	9	Agent orchestration · Multimodal · Evals · Guardrails · RAG · LLM observability · Tool use · Vision
Apple	Sr. Machine Learning Research Engineer, Siri Speech	Big Tech	9	Multimodal · Fine-tuning · Model serving · Inference infra · Frontier research
Meta	Research Scientist Intern, Multimodal AI (PhD)	Big Tech	9	Multimodal · Evals · Fine-tuning · LLM observability
Google	Senior Staff Software Engineer, Applied AI	Big Tech	9	Agent orchestration · Model serving · Inference infra · Fine-tuning · Evals · RL robotics
Apple	Machine Learning Architect - Conversational Speech	Big Tech	9	Multimodal · Model serving · Fine-tuning · Inference infra

Frequently asked questions

What is Audio & speech in AI?
Speech recognition, synthesis, and audio understanding — TTS, ASR, voice agents, and audio-native LLMs. Primary AI lifecycle stage: application.
How many AI roles reference Audio & speech right now?
301 active AI roles across 54 companies in our index reference Audio & speech as of today.
Which companies are hiring for Audio & speech roles?
The companies with the most active Audio & speech listings are Google (62 roles), Amazon (49 roles), xAI (32 roles), Meta (12 roles), and Apple (10 roles).
What AI lifecycle stage does Audio & speech belong to?
Audio & speech primarily belongs to the application stage of the AI lifecycle. In current hiring, Audio & speech roles concentrate at the post-training (24%) and agents (20%) stages.
What sectors invest most in Audio & speech?
The sectors with the most active Audio & speech hiring are Big Tech, AI Frontier, and Enterprise.