Audio & speech

Speech recognition, synthesis, and audio understanding — TTS, ASR, voice agents, and audio-native LLMs.

Primary AI lifecycle stage: application.

As of today, 301 active AI roles across 54 companies in our index reference Audio & speech. Hiring concentrates at the post-training (24%) and agents (20%) stages. Most common sectors: Big Tech, AI Frontier, Enterprise.

Top hiring: Google (62 roles), Amazon (49 roles), xAI (32 roles), Meta (12 roles), Apple (10 roles).

Sector

Big Tech (268), AI Frontier (94), Enterprise (45), Semiconductors (22), Vertical AI (18), Consumer (14), Data AI (8), Banking (8), Media (6), Telecom (4), Multimodal (4), Pharma (3), Hospitality (2), Fintech (2), Robotics (1), Consulting (1), Auto (1)

Function

Engineering (373), Research (102), Product (26)

Filtered by sector: Enterprise

45 AI roles tagged audio_speech.

Company	Title	Sector	AI score	Other tags
Oracle	Snr Director, Applied Science	Enterprise	9	Multimodal · Agent orchestration · Model serving · Inference infra · RAG · Evals · Guardrails · LLM observability · Vision
Adobe	Sr Applied Scientist, Generative AI/ML	Enterprise	9	Multimodal · Fine-tuning · Pretraining · Model serving · Vision
Adobe	Staff Applied Scientist, Generative AI/ML	Enterprise	9	Multimodal · Pretraining · Fine-tuning · Model serving · Frontier research · Vision
Adobe	Senior Machine Learning Engineer	Enterprise	9	Model serving · Fine-tuning · Multimodal · Vision · Inference infra
Adobe	Principal Scientist, Generation Data Architect (Image & Video & Audio)	Enterprise	9	Multimodal · Synthetic data · Fine-tuning · Vision
Adobe	Principal Scientist - Applied Research, ASML (Multimodal Foundation Models)	Enterprise	9	Multimodal · Synthetic data · Fine-tuning · Frontier research · Vision
Salesforce	Research Scientist - Salesforce AI Research	Enterprise	9	Agent orchestration · Multimodal · Vision · LLM observability · Fine-tuning · Frontier research · Agent research · Model serving · Evals
Adobe	2026 University Graduate - Applied Scientist	Enterprise	9	Fine-tuning · Pretraining · Multimodal · Vision
Adobe	2026 University Graduate - Research Scientist/Engineer	Enterprise	9	Fine-tuning · RL post-training · Frontier research · Multimodal
Adobe	Senior Research Scientist	Enterprise	9	Multimodal · Frontier research · Fine-tuning · Pretraining
Adobe	Applied Scientist, Generative AI/ML	Enterprise	9	Multimodal · Fine-tuning · Pretraining · Model serving · Vision · Frontier research
Canva	Senior Research Scientist - Audio & Video AI	Enterprise	9	Vision · Multimodal · Fine-tuning · Frontier research · Model serving
Oracle	[REMOTE] Principal Applied Scientist	Enterprise	8	Model serving · Vision
Adobe	Applied Scientist	Enterprise	8	Fine-tuning · Multimodal · Vision · Synthetic data · Pretraining
Verkada	AI Software Engineering Intern - Fall 2026	Enterprise	8	Multimodal · Vision · Model serving · Inference infra
Zendesk	Principal Voice AI Engineer	Enterprise	8	Fine-tuning · Model serving
Moveworks	Principal Product Manager, Search Platform	Enterprise	8	Agent orchestration · Semantic search · Model serving · Inference infra
Oracle	Architect, Applied Scientist (Reasoning Platform, AI Lead)	Enterprise	7	Model serving · Recommender systems · Vision
Notion	Engineering Manager, Mobile AI	Enterprise	7	Agent orchestration · Tool use · Evals · Guardrails · Multimodal
ClickUp	Senior AI Engineer, Voice Platform	Enterprise	7	Fine-tuning · LLM observability · Model serving · Multimodal
Adobe	Software Development Manager - Adobe Digital Audio Engineering	Enterprise	7	Multimodal
Adobe	Software Development Engineer	Enterprise	7	Agent orchestration · Evals · Guardrails · LLM observability · Multimodal · Model serving
Verkada	Senior Backend Engineer - Virtual Guard	Enterprise	7	Multimodal · Vision · Inference infra · Model serving · Agent orchestration
Notion	Software Engineer, AI Capture	Enterprise	7	Agent orchestration · Model serving · LLM observability
Canva	Senior Machine Learning Engineer - Content ML (AU remote)	Enterprise	7	Vision
Canva	Senior Machine Learning Engineer - Content ML (AU remote)	Enterprise	7	Vision
Canva	Senior Machine Learning Engineer - Content ML (AU remote)	Enterprise	7	Vision
Canva	Senior Machine Learning Engineer - Content ML (AU remote)	Enterprise	7	Multimodal · Vision
Toast	Senior Software Engineer, Voice AI	Enterprise	7	Inference infra · Model serving · LLM observability
Toast	Principal Product Manager, Voice AI	Enterprise	7	Agent orchestration · Evals · Inference infra · Model serving

Frequently asked questions

What is Audio & speech in AI?

Speech recognition, synthesis, and audio understanding: TTS, ASR, voice agents, and audio-native LLMs. Primary AI lifecycle stage: application.

How many AI roles reference Audio & speech right now?

301 active AI roles across 54 companies in our index reference Audio & speech as of today.

Which companies are hiring for Audio & speech roles?

The companies with the most active Audio & speech listings are Google (62 roles), Amazon (49 roles), xAI (32 roles), Meta (12 roles), and Apple (10 roles).

What AI lifecycle stage does Audio & speech belong to?

Audio & speech primarily belongs to the application stage of the AI lifecycle. In current hiring, Audio & speech roles concentrate at the post-training (24%) and agents (20%) stages.

What sectors invest most in Audio & speech?

The sectors with the most active Audio & speech hiring are Big Tech, AI Frontier, and Enterprise.