Models that process or generate across modalities — text, images, audio, video — within a single architecture; covers training, fine-tuning, and application-layer integration.
Primary AI lifecycle stages: pre-training, post-training, and application.
As of today, 1,073 active AI roles across 126 companies in our index reference Multimodal. Hiring concentrates at the agents (34%) and post-training (21%) stages. Most common sectors: Big Tech, Enterprise, Semiconductors. New postings fell 30% in the last 30 days versus the prior 30 (473 → 332).
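The 30% figure follows from the two posting counts. A minimal sketch of the arithmetic, assuming the index reports a simple relative change rounded to a whole percent (the function name `percent_change` is illustrative and not taken from the index):

```python
def percent_change(previous: int, current: int) -> float:
    """Return the relative change from previous to current, in percent."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous * 100.0


prior_30_days = 473  # new postings in the prior 30-day window
last_30_days = 332   # new postings in the most recent 30-day window

change = percent_change(prior_30_days, last_30_days)
print(f"{change:.1f}%")  # prints "-29.8%", which rounds to a 30% drop
```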
The companies with the most active Multimodal listings are Amazon (227 roles), Google (104 roles), NVIDIA (80 roles), Adobe (77 roles), and Apple (41 roles).
15 AI roles tagged Multimodal, all in the Media sector.
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Disney | Lead Machine Learning Engineer | Media | 9 | Agent orchestration · Agent research · RAG · LLM observability · Evals · Guardrails · Model serving · Inference infra |
| Disney | Sr Staff R&D Engineer | Media | 9 | Audio & speech · Fine-tuning · Model serving |
| Warner Bros Discovery | Sr. Principal Data Scientist | Media | 8 | Recommender systems · Forecasting |
| Disney | Sr Data Scientist | Media | 8 | Fine-tuning · RAG · Inference infra · Model serving · Evals · Vector DB |
| Comcast | Machine Learning Engineer (GoLang) | Media | 8 | Agent orchestration · Tool use · RAG · Vector DB · Fine-tuning · Model serving · Vision · Audio & speech |
| Disney | Lead Data Scientist, Ad Research | Media | 8 | Evals · Agent orchestration · Vision |
| Disney | Senior Machine Learning Engineer, Ad Platforms | Media | 8 | Agent orchestration · Fine-tuning · Model serving · Evals · Audio & speech |
| Disney | Lead Machine Learning Engineer, Ads Research | Media | 8 | Agent orchestration · Fine-tuning · Model serving · Audio & speech |
| Disney | Lead Machine Learning Engineer, Ad Platforms | Media | 8 | Recommender systems · Search & ranking · Fine-tuning · RAG · LLM observability · Vision · Evals |
| AppLovin | Applied Research Scientist | Media | 8 | Recommender systems |
| Disney | Lead Software Engineer | Media | 7 | Vision · Model serving |
| Disney | Production Innovation Technologist (PH) | Media | 7 | Fine-tuning · Model serving |
| Disney | Lead Data Scientist, Content Intelligence | Media | 7 | Fine-tuning · Evals · LLM observability · RAG · Vector DB · Vision |
| Warner Bros Discovery | Senior Data Scientist - Video AI team, Bangalore | Media | 7 | Model serving · Inference infra · RAG · Vector DB · Fine-tuning · Evals · Vision |
| Disney | Disney Research Intern | Media | 7 | Vision |