Audio AI Companies

Explore 21 Audio AI companies in our AI directory. Leading companies include Spotify AI, Epidemic Sound, Suno.

21 Companies

Spotify AI

Stockholm, Sweden

Spotify leverages advanced machine learning models, including collaborative filtering and natural language processing, to power personalized music and podcast recommendations through features like Discover Weekly and Release Radar. Their AI capabilities extend to audio analysis for features such as the automated DJ experience and podcast transcription, as well as content moderation systems designed to ensure platform safety. With over 574 million monthly active users globally, Spotify’s AI-driven personalization is a key differentiator in the competitive streaming market and contributes significantly to user engagement and retention.

enterprise $65.0B

Epidemic Sound

Stockholm, Sweden

Epidemic Sound is a Swedish provider of royalty-free music and sound effects for content creators. Their platform utilizes AI-powered search and recommendation algorithms to facilitate efficient content matching and discovery within a vast library of audio assets. Targeting video creators, marketers, and podcasters, Epidemic Sound offers a subscription-based licensing model providing unrestricted usage rights for their audio content globally.

scaleup $450M

Suno

Cambridge, United States

Suno is the leading AI music generation platform, with 100M+ users. It generates $200M in annual revenue and raised $250M at a $2.45B valuation, backed by NVIDIA.

scaleup $375M

Suno

Cambridge, United States

Suno is a US-based generative AI company specializing in the creation of original music from text-based prompts. Their core technology utilizes AI models to compose full songs, including lyrics and instrumentation, allowing users to rapidly prototype and produce musical content. Suno targets a broad market including musicians, content creators, and hobbyists seeking accessible tools for music production and exploration, offering a platform for both creation and discovery.

startup $125M

Descript

San Francisco, United States

Descript develops a cross-platform audio and video editing platform centered around speech-to-text technology, enabling users to edit media by directly manipulating transcripts. Key innovations include Overdub, a realistic voice synthesis tool allowing users to correct or add to recordings using AI-generated speech, and Studio Sound, which enhances audio clarity with a single click. Targeting podcasters, video creators, and marketing teams, Descript has gained traction for its unique transcript-based workflow and recently launched Underlord, an AI-powered video editor capable of generating and editing video content from text prompts.

startup $100M

Papercup

London, United Kingdom

Papercup provides AI-powered dubbing and voice-over solutions for video content, utilizing a patented technology stack trained on extensive licensed voice data. Their platform combines synthetic voices with human post-editing to deliver natural-sounding, culturally nuanced audio localization. Papercup targets enterprise-level content creators and media companies seeking scalable and cost-effective methods to expand global reach without sacrificing audience engagement.

startup $50M

Sanas

Palo Alto, United States

Sanas provides a real-time Speech AI platform specializing in accent and language translation for improved communication clarity. Their core technology modulates speech to neutralize accents and remove noise while preserving vocal characteristics, enabling natural-sounding conversations in over 25 languages. Sanas targets call centers and communication-heavy businesses seeking to enhance customer and employee experiences, reduce communication friction, and improve key performance indicators such as customer satisfaction (CSAT) and average handle time (AHT).

startup $50M

PlayHT

San Francisco, United States

PlayHT is a US-based AI company specializing in realistic text-to-speech (TTS) and voice cloning technology delivered via API. Their platform offers over 200 AI voices in 40+ languages, focusing on low-latency synthesis for applications requiring natural-sounding, multi-speaker audio. PlayHT targets content creators and enterprises seeking to automate voiceovers and generate audio content at scale.

startup $29M

PlayAI

San Francisco, United States

PlayAI develops voice cloning and text-to-speech technology. Their platform creates custom AI voice models from audio samples, enabling natural-sounding speech synthesis for content creators and businesses.

startup $21M

Podcastle

Wilmington, United States

Podcastle is a US-based software company offering an all-in-one platform for video and podcast creation directly within a web browser. Their core technology centers on AI-powered tools for audio and video editing, including features for noise reduction, automatic editing, and AI voice generation. Podcastle targets long-form content creators seeking a streamlined, browser-based solution for recording, editing, and distributing professional-quality audio and video content.

startup $14M

Udio

New York, United States

Udio is a US-based generative AI company specializing in music creation. Their platform utilizes text-to-music AI technology, enabling users to generate complete songs from simple text prompts. Udio targets musicians, content creators, and hobbyists seeking rapid prototyping or royalty-free music generation capabilities.

startup $10M

Amper Music

New York, United States

Amper Music, a Shutterstock company, provides an AI-powered music composition platform that generates original, royalty-free tracks. Utilizing generative algorithms and machine learning, Amper enables content creators – including video producers, advertisers, and game developers – to quickly and affordably produce customized music tailored to specific moods, styles, and lengths. This solution streamlines the music licensing process and offers a cost-effective alternative to traditional music sourcing.

commercial $9M

Speechki

San Francisco, United States

Speechki is a text-to-speech platform offering 500+ AI voices in 77 languages. Backed by Greycroft and Alchemist, they enable content creators to convert text to natural-sounding audio at scale.

startup $5M

Endel

Berlin, Germany

Endel is a German technology company developing AI-powered generative audio environments designed to improve cognitive performance and wellbeing. Their core product utilizes a patented algorithm that creates personalized soundscapes that adapt in real time to user-specific data such as time of day, weather, and biometrics. Endel targets individuals seeking to enhance focus, reduce stress, and improve sleep quality through scientifically backed auditory experiences.

startup

iZotope

Cambridge, United States

iZotope develops advanced audio processing software leveraging machine learning for tasks like mixing, mastering, and dialogue editing. Their core technology centers on neural networks trained on vast datasets of professionally produced audio to deliver intelligent assistance and automated solutions for common audio challenges. Targeting audio engineers, musicians, and post-production professionals, iZotope provides tools that streamline workflows and enhance sonic quality with data-driven precision.

scaleup

Teachable Machine

Mountain View, United States

Teachable Machine is a web-based platform developed by Google that enables users to rapidly create machine learning models using a no-code interface. The platform focuses on image, audio, and pose-based recognition, allowing individuals to train custom models directly within their browser. Primarily targeting educators, artists, and hobbyists, Teachable Machine lowers the barrier to entry for machine learning by eliminating the need for programming expertise and facilitating quick prototyping for integration into web applications and creative projects.

enterprise

Acoustic.ai

Copenhagen, Denmark

Acoustic.ai develops voice AI solutions for automotive and consumer electronics, focusing on noise cancellation and voice enhancement.

commercial

AIVA

Luxembourg City, Luxembourg

AIVA is a Luxembourg-based company specializing in AI-driven music composition. Their core technology is a generative AI model capable of autonomously composing original soundtracks across a variety of genres and styles. AIVA targets content creators in film, gaming, and advertising seeking royalty-free, customizable music solutions, offering an alternative to traditional music licensing and composition.

commercial

Fish Audio

Shanghai, China

Fish Audio offers studio-grade AI text-to-speech and instant voice cloning with 1,000+ voices in 70+ languages. Their open-source models have gained significant developer adoption.

startup

Boomy

Berkeley, United States

Boomy develops a platform enabling users to create original music tracks via artificial intelligence. Their core technology utilizes generative AI models (specifically, a combination of diffusion and transformer models) to compose music across various genres based on user-defined parameters. Targeting both amateur musicians and content creators, Boomy uniquely allows users to commercially distribute and potentially earn royalties from AI-generated compositions.

company

Soundraw

Tokyo, Japan

Soundraw develops an AI-powered music generation platform focused on providing royalty-free music for content creators. Their core technology utilizes an in-house trained AI model to compose original instrumentals, allowing for granular customization via a stem-based mixer. Soundraw uniquely targets the need for legally safe, customizable background music, enabling monetization opportunities for users without copyright concerns.

company
