Voice AI Companies

Explore 33 Voice AI companies in our AI directory. Leading companies include Dialpad, Mobvoi, Suki AI.

33 Companies

Dialpad

San Francisco, United States

Dialpad is a US-based provider of an all-in-one cloud communications platform integrating voice, video, messaging, and a contact center solution. Their core technology leverages real-time Voice AI to provide features like automated call transcription, agent coaching, and autonomous workflow execution for tasks like appointment scheduling and refunds. Dialpad targets businesses seeking to improve contact center performance and streamline communications across multiple channels, with a focus on security and integration with existing CRM and collaboration tools.

scaleup $418M

Mobvoi

Beijing, China

Mobvoi is a Chinese technology company specializing in voice AI and intelligent wearables. Their core technology centers around a proprietary Chinese Natural Language Processing (NLP) engine powering voice assistants and features across their product line, most notably the TicWatch series of smartwatches. Mobvoi primarily targets the Chinese market with localized AI experiences, while also offering select wearables internationally with a focus on health and fitness tracking.

scaleup $300M

Suki AI

Redwood City, United States

Suki AI develops an ambient clinical intelligence platform that utilizes voice AI and natural language processing to automate clinical documentation workflows. Their technology captures and analyzes patient-physician conversations to generate comprehensive notes, orders, and instructions directly within existing Electronic Health Record (EHR) systems. Suki AI targets healthcare providers and organizations seeking to reduce administrative burden, reduce physician burnout, and enhance revenue cycle management through streamlined documentation processes.

scaleup $165M

Poly AI

London, United Kingdom

Poly AI develops conversational AI solutions for enterprise contact centers, enabling fully autonomous handling of customer voice calls. Their core technology focuses on delivering highly natural, multilingual voice interactions that replicate human agent conversations, distinguishing them through a customer-led approach to AI training. Poly AI targets businesses seeking to scale customer service while maintaining a high-quality, localized brand experience, particularly within the hospitality and service industries.

scaleup $120M

ElevenLabs

New York, United States

ElevenLabs specializes in realistic voice AI, offering a platform for text-to-speech generation and voice cloning powered by proprietary models like their flagship voice agent technology. Their platform provides access to over 5,000 voices in 70+ languages, and recently expanded with the launch of the Iconic Marketplace featuring digitally-recreated voices of prominent figures such as Matthew McConaughey and Sir Michael Caine. ElevenLabs targets content creators, developers, and businesses seeking to integrate high-quality, customizable voice solutions into applications ranging from audiobooks and gaming to virtual assistants and accessibility tools.

startup $101M

Parloa

Berlin, Germany

Parloa delivers a generative AI-powered platform for contact center automation, enabling enterprises to deploy and manage personalized “AI agents” that handle high-volume customer interactions. Their technology orchestrates the full AI agent lifecycle – from development to deployment and optimization – focusing on complex tasks like scheduling, refunds, and personalized recommendations. Parloa targets large enterprises seeking to improve customer loyalty and efficiency, and their platform is designed for high-stakes environments requiring precision and scalability in customer communication.

startup $100M

Deepgram

San Francisco, United States

Deepgram is a US-based provider of voice AI APIs for enterprise applications, offering unified speech-to-text, text-to-speech, and LLM orchestration. Their platform distinguishes itself through a single API designed to minimize complexity, latency, and cost compared to component-based solutions, and supports both real-time and batch processing with telephony integrations. Deepgram targets developers and businesses requiring highly accurate and scalable voice intelligence for applications like contact centers, voice assistants, and conversational AI systems.

startup $86M

Speechmatics

Cambridge, United Kingdom

Speechmatics is a UK-based technology company specializing in accurate, low-latency Automatic Speech Recognition (ASR) and speech-to-text solutions. Their core offering is a Speech API providing transcription, real-time translation, and text-to-speech capabilities, deployable on-device, on-premise, or in the cloud. Speechmatics targets enterprises requiring high-quality voice AI with a focus on data privacy, offering a non-logging standard deployment option.

scaleup $70M

Papercup

London, United Kingdom

Papercup provides AI-powered dubbing and voice-over solutions for video content, utilizing a patented technology stack trained on extensive licensed voice data. Their platform combines synthetic voices with human editorial post-editing to deliver natural-sounding, culturally nuanced audio localization. Papercup targets enterprise-level content creators and media companies seeking scalable and cost-effective methods to expand global reach without sacrificing audience engagement.

startup $50M

Infinitus Systems

San Francisco, United States

Infinitus Systems develops a voice AI platform that automates administrative and clinical phone calls for U.S. healthcare providers and payers. Their technology specifically addresses time-consuming tasks like prior authorization and routine patient communication, utilizing AI agents to handle calls without human intervention. This solution aims to reduce administrative burden, improve staff productivity, and ultimately enhance patient outcomes within the healthcare system.

startup $50M

Hume AI

New York, United States

Hume AI builds empathic AI that understands and responds to human emotional expressions. It provides APIs for emotion recognition in voice, face, and language.

startup $50M

aiOla

Tel Aviv, Israel

aiOla transforms frontline speech into structured, validated data for enterprise systems. Voice-agentic workflows replace manual data entry.

startup $40M

Modulate

Cambridge, United States

Modulate is a US-based AI platform that analyzes live and recorded voice conversations to deliver real-time insights into content, intent, and emotional state. Their core technology decodes multi-dimensional voice signals – including deception, toxicity, and synthetic speech – to provide actionable alerts and APIs. Modulate targets businesses requiring enhanced fraud prevention, trust & safety measures, and customer experience improvements through proactive voice intelligence, serving sectors like gaming, contact centers, and online communities.

startup $36M

Decagon

San Francisco, United States

Decagon delivers AI-powered virtual agents for enterprise customer support, specializing in voice and chat channels. Their core technology focuses on customizable conversational AI with cross-channel memory, enabling personalized and connected customer interactions. Decagon targets companies seeking to significantly increase customer support deflection rates, scale operations to 24/7 availability, and improve key customer experience metrics like First Response Time and Customer Satisfaction.

startup $35M

PlayHT

San Francisco, United States

PlayHT is a US-based AI company specializing in realistic text-to-speech (TTS) and voice cloning technology delivered via API. Their platform offers over 200 AI voices in 40+ languages, focusing on low-latency synthesis for applications requiring natural-sounding, multi-speaker audio. PlayHT targets content creators and enterprises seeking to automate voiceovers and generate audio content at scale.

startup $29M

Cartesia

San Francisco, United States

Cartesia builds fast, realtime AI models for voice and speech. Their Sonic model enables sub-100ms latency text-to-speech for conversational AI.

startup $27M

Fixie.ai

Seattle, United States

Fixie.ai develops the Ultravox platform, enabling developers to build and deploy AI agents powered by a next-generation, open-source Speech Language Model (SLM). Ultravox focuses on natural speech understanding to facilitate more human-like conversational AI experiences. The company targets businesses seeking to integrate scalable voice AI capabilities into their applications and workflows.

startup $25M

LiveKit

San Francisco, United States

LiveKit is an open-source platform for building realtime audio and video applications. It powers voice AI agents with ultra-low latency infrastructure.

startup $23M

Bland AI

San Francisco, United States

Bland AI provides enterprises with AI-powered phone agents capable of handling both inbound and outbound calls using natural language processing. Their core technology centers on customizable voice models trained on client-provided recordings and transcriptions, offering a branded conversational experience. Targeting businesses across verticals like finance, healthcare, and logistics, Bland AI differentiates itself through on-premise data security and seamless integration capabilities for automating customer support, sales, and operational communications.

startup $16M

Rinna

Tokyo, Japan

Rinna is a Japanese AI company specializing in conversational AI and virtual character development. Their core technology centers around creating highly realistic AI personalities capable of natural language interactions, initially demonstrated through integrations with LINE and evolving into AI-powered virtual YouTubers (AITubers). Rinna targets businesses and entertainment sectors seeking to leverage advanced AI for customer engagement, content creation, and immersive digital experiences, with a strong focus on the Japanese market.

startup $15M

Resemble AI

San Francisco, United States

Resemble AI develops a generative AI platform specializing in voice and audio technology, offering products like real-time voice cloning via their Chatterbox model, and audio editing tools like Edit. Their key innovations include DETECT-3B Omni, a multi-modal deepfake detection model consistently ranked among the industry’s most robust, alongside PerTh, an AI-powered watermarking solution for content provenance. Resemble AI serves enterprise and government clients – including Fortune 500 companies – with solutions for content creation, security, and speaker verification, and is trusted by over 3 million teams worldwide.

startup $12M

Vapi

San Francisco, United States

Vapi provides a platform for developers to build and deploy configurable voice AI agents. Their core technology is a comprehensive API enabling advanced conversational AI functionality for phone-based applications. Vapi targets a broad market ranging from startups to large enterprises seeking to automate phone operations and create scalable voice AI products.

startup $12M

Murf AI

San Francisco, United States

Murf AI develops a text-to-speech (TTS) platform offering over 200 AI voices across 20+ languages, powering realistic voiceovers for video content, presentations, and marketing materials. Their core technology leverages advanced neural network architectures to generate highly natural-sounding speech, and they provide both a user-friendly AI Voice Generator and robust Text-to-Speech APIs & SDKs for developers. Murf AI serves a broad market including content creators, educators, and businesses seeking scalable voice solutions, and is recognized for its speed and efficiency in building voice agents.

commercial $10M

Retell AI

San Francisco, United States

Retell AI provides a platform for businesses to build and deploy AI-powered voice agents for automating phone calls. Their technology leverages real-time knowledge base synchronization and natural language processing to handle customer interactions, including navigating IVR systems, scheduling appointments, and facilitating warm transfers to live agents. Retell AI targets companies seeking to improve call center efficiency and customer service through scalable, automated phone solutions, as demonstrated by deployments with companies like Everise.

startup $5M

iFlytek

Hefei, China

iFlytek develops advanced AI-powered language solutions, including its core Jieli speech recognition platform and translation tools supporting over 60 languages. The company’s innovations center on deep learning models for accurate speech-to-text, text-to-speech, and machine translation, demonstrated in products like their real-time transcription services for meetings and content creation. As China’s leading provider in this space, iFlytek increasingly focuses on international expansion and serves sectors including education, digital marketing, and professional communication.

enterprise

Cosito

Boston, United States

Cosito is an MIT-founded startup building AI-powered microphones that let frontline teams log data by voice, with no physical forms and no typing.

startup

Emotech

London, United Kingdom

Emotech develops multimodal AI solutions focused on enhancing customer and user interactions, with key products including a multilingual speech platform and customizable generative AI avatars. Their technology specializes in realistic AI-driven speech synthesis – notably offering Arabic chatbots with dialect support – and a unique AI-powered pronunciation assessment tool for language learning. Emotech targets businesses seeking to improve customer service, create immersive digital experiences, and innovate in areas like education and gaming, demonstrated by claims of a 30% boost in customer satisfaction for early adopters.

startup

Sonantic

London, United Kingdom

Sonantic develops realistic, emotionally-expressive AI voices for digital media. Their core technology utilizes a proprietary neural network trained on human performance data to generate nuanced vocal performances from text. Acquired by Spotify, Sonantic primarily serves the gaming, animation, and audiobook industries, offering a solution for scalable and high-quality voice acting.

startup

SoundHound

Santa Clara, United States

SoundHound AI develops and licenses voice AI technologies that enable conversational interfaces for a variety of industries, including automotive, retail, and finance. Their core offering is a fully independent voice AI platform capable of handling over 10 billion conversations annually, focusing on agentic AI solutions that automate complex tasks. SoundHound differentiates itself by offering a complete, customizable voice AI solution – rather than relying on cloud-based assistants – allowing businesses to own the entire interaction and maximize ROI through cost reduction and revenue generation.

enterprise

RingCentral

Belmont, United States

RingCentral provides a unified cloud communications platform integrating voice, video, messaging, and contact center solutions. Their core AI technology focuses on real-time conversation intelligence and automation within these communication channels, offering features like call transcription, sentiment analysis, and automated workflows. RingCentral targets businesses of all sizes seeking to improve agent productivity, enhance customer experiences, and gain actionable insights from their communications data.

enterprise

Acoustic.ai

Copenhagen, Denmark

Acoustic.ai develops voice AI solutions for automotive and consumer electronics, focusing on noise cancellation and voice enhancement.

commercial

Fireflies.ai

San Francisco, United States

Fireflies.ai develops an AI-powered meeting assistant that automatically transcribes, summarizes, and analyzes conversational data across various video conferencing platforms. Their core technology centers on speech-to-text conversion and natural language processing to identify speakers and extract key insights from meetings. Fireflies.ai targets professional teams seeking to improve meeting productivity and knowledge management through searchable conversation archives.

company

Rev AI

San Francisco, United States

Rev AI provides a speech-to-text API specializing in automated transcription and speech recognition services. Their core technology centers on a diverse, large-dataset trained AI model designed for high accuracy across varied audio qualities and accents. They target developers and businesses requiring scalable, programmatic transcription solutions for applications like voice search, media monitoring, and accessibility services.

company
