Speech Recognition Companies

Explore 31 Speech Recognition companies in our AI directory. Leading companies include Uniphore, Verbit, and Dialpad.

31 Companies

Uniphore

Chennai, India

Uniphore provides an enterprise-grade Business AI Cloud platform focused on bridging the gap between consumer and business AI applications. Their core technology centers on a composable and secure AI architecture encompassing data, knowledge, models, and agents, with a strong emphasis on speech analytics and conversational AI. Uniphore targets large enterprises seeking to deploy and manage AI solutions across their operations with a focus on data sovereignty and control.

scaleup $610M

Verbit

New York, United States

Verbit is a US-based AI company specializing in highly accurate transcription and captioning services. Their core technology, the Captivate™ ASR engine enhanced by Gen.V™ generative AI, delivers rapid, customizable transcripts with automated summarization and keyword extraction. Verbit primarily serves speech-intensive industries like legal and education, offering solutions to improve accessibility, enhance productivity, and derive actionable insights from audio and video content.

scaleup $550M

Dialpad

San Francisco, United States

Dialpad is a US-based provider of an all-in-one cloud communications platform integrating voice, video, messaging, and a contact center solution. Their core technology leverages real-time Voice AI to provide features like automated call transcription, agent coaching, and autonomous workflow execution for tasks like appointment scheduling and refunds. Dialpad targets businesses seeking to improve contact center performance and streamline communications across multiple channels, with a focus on security and integration with existing CRM and collaboration tools.

scaleup $418M

Observe.AI

San Francisco, United States

Observe.AI provides AI Agents for enterprise contact centers, automating and improving customer interactions across voice channels. Their technology utilizes advanced speech recognition and natural language processing to accurately understand complex, real-world conversations – even with background noise and interruptions – and integrates with existing CRM and workflow systems. This enables businesses to automate call resolution, improve agent performance through AI-powered quality assurance, and achieve predictable outcomes in customer service operations.

scaleup $213M

Loom

San Francisco, United States

Loom is a video messaging platform that enables asynchronous communication through quick screen and camera recordings. Utilizing automatic speech recognition (ASR) technology, Loom provides searchable video transcripts and captions for improved accessibility and information retrieval. Primarily targeting professionals and teams, Loom streamlines communication and documentation workflows, offering a more efficient alternative to traditional email and meetings.

enterprise $203M

AISpeech

Suzhou, China

AISpeech is a China-based conversational AI platform company built on specialized large models, enabling intelligent connectivity and streamlined operations.

scaleup $200M

Cogito

Boston, United States

Cogito, now part of Verint, delivers real-time AI-powered coaching and performance analytics for contact centers. Their core technology utilizes proprietary AI models to analyze voice conversations, providing both customer experience (CX) and employee experience (EX) scoring during live calls. This enables targeted, in-the-moment guidance for agents, with a focus on improving key metrics like average handle time, customer satisfaction, and revenue generation for large enterprises in sectors like telecommunications and healthcare.

scaleup $130M

AssemblyAI

San Francisco, United States

AssemblyAI develops highly accurate speech-to-text APIs, together with LeMUR, its framework for applying large language models to transcribed speech, and a suite of audio intelligence features like speaker diarization, entity detection, and topic detection. Their key innovation lies in offering low-latency, high-accuracy transcription optimized for real-time and asynchronous applications, alongside advanced features like content moderation and redaction. Serving a diverse market including contact centers, media companies, and research institutions, AssemblyAI processes millions of minutes of audio data monthly and is recognized for consistently achieving industry-leading Word Error Rates (WER) in independent evaluations.

startup $115M

Otter.ai

Mountain View, United States

Otter.ai develops AI-powered meeting solutions, most notably its Otter Meeting Agent platform, which provides real-time transcription, automated summaries, and AI-driven action item detection. The platform leverages advanced speech recognition and natural language processing to create searchable meeting records and facilitate collaboration, integrating with popular video conferencing tools like Zoom, Google Meet, and Microsoft Teams. Otter.ai currently serves a broad professional market, with reported user testimonials indicating significant time savings – up to 33% according to one VP of Sales at Aiden Technologies – and increased productivity for teams reliant on frequent meetings.

startup $113M

Chorus.ai

San Francisco, United States

Chorus.ai, now integrated within ZoomInfo, delivers conversation intelligence software that analyzes sales calls and meetings. Their platform utilizes AI-powered speech and text analytics to identify key conversation patterns, coaching opportunities, and deal-critical insights. This technology primarily serves revenue-focused teams within B2B organizations to improve sales performance and forecasting accuracy.

startup $100M

Ambience Healthcare

San Francisco, United States

Ambience Healthcare provides an AI-powered platform that automates clinical documentation and coding for U.S. healthcare systems. Utilizing natural language processing and speech recognition, the platform generates structured data from patient encounters, reducing administrative burden on clinicians. Ambience targets health systems seeking to improve revenue cycle management, ensure compliance, and allow physicians to focus on patient care rather than documentation.

scaleup $100M

Deepgram

San Francisco, United States

Deepgram is a US-based provider of voice AI APIs for enterprise applications, offering unified speech-to-text, text-to-speech, and LLM orchestration. Their platform distinguishes itself through a single API designed to minimize complexity, latency, and cost compared to component-based solutions, and supports both real-time and batch processing with telephony integrations. Deepgram targets developers and businesses requiring highly accurate and scalable voice intelligence for applications like contact centers, voice assistants, and conversational AI systems.

startup $86M

Speechmatics

Cambridge, United Kingdom

Speechmatics is a UK-based technology company specializing in accurate, low-latency Automatic Speech Recognition (ASR) and speech-to-text solutions. Their core offering is a Speech API providing transcription, real-time translation, and text-to-speech capabilities, deployable on-device, on-premise, or in the cloud. Speechmatics targets enterprises requiring high-quality voice AI with a focus on data privacy, offering a non-logging standard deployment option.

scaleup $70M

Corti

Copenhagen, Denmark

Corti is a Danish AI infrastructure provider specializing in healthcare applications. Their core product is a highly accurate medical Automatic Speech Recognition (ASR) API that converts clinical conversations into structured data and documentation. Corti targets healthcare developers and providers seeking to rapidly build and deploy voice-enabled workflows – such as automated note-taking, report generation, and point-of-care support – without managing complex AI infrastructure.

startup $60M

Sanas

Palo Alto, United States

Sanas provides a real-time Speech AI platform specializing in accent and language translation for improved communication clarity. Their core technology modulates speech to neutralize accents and remove noise while preserving vocal characteristics, enabling natural-sounding conversations in over 25 languages. Sanas targets call centers and communication-heavy businesses seeking to enhance customer and employee experiences, reduce communication friction, and improve key performance indicators like CSAT and AHT.

startup $50M

Modulate

Cambridge, United States

Modulate is a US-based AI platform that analyzes live and recorded voice conversations to deliver real-time insights into content, intent, and emotional state. Their core technology decodes multi-dimensional voice signals – including deception, toxicity, and synthetic speech – to provide actionable alerts and APIs. Modulate targets businesses requiring enhanced fraud prevention, trust & safety measures, and customer experience improvements through proactive voice intelligence, serving sectors like gaming, contact centers, and online communities.

startup $36M

Fano Labs

Hong Kong, Hong Kong

Fano Labs specializes in speech recognition and NLP for Asian languages, serving financial services and customer service industries.

commercial $30M

Fixie.ai

Seattle, United States

Fixie.ai develops the Ultravox platform, enabling developers to build and deploy AI agents powered by a next-generation, open-source Speech Language Model (SLM). Ultravox focuses on natural speech understanding to facilitate more human-like conversational AI experiences. The company targets businesses seeking to integrate scalable voice AI capabilities into their applications and workflows.

startup $25M

Krisp

San Francisco, United States

Krisp develops AI-powered tools to enhance the quality and productivity of virtual meetings. Their core product is an AI Meeting Assistant that combines industry-leading noise cancellation with automated transcription, summarization, and accent conversion. Krisp targets professionals and teams seeking to improve communication clarity and efficiency in remote and hybrid work environments by automating key meeting tasks.

startup $17M

Voiceitt

Tel Aviv, Israel

Voiceitt develops AI-powered speech recognition technology specifically designed to understand non-standard speech patterns, including those resulting from speech impairments, accents, or aging-related conditions. Their core product is a customizable API and software solution leveraging a proprietary database of atypical speech and advanced machine learning. Voiceitt primarily serves individuals with speech disabilities, as well as accessibility applications for accented speakers and those in the Deaf community, enabling greater communication independence and access to voice-controlled technologies.

startup $15M

SoapBox Labs

Dublin, Ireland

SoapBox Labs develops voice AI specifically designed for children, enabling speech recognition in educational apps with child privacy protection.

commercial $11M

Speechly

Helsinki, Finland

Speechly is a Finnish company specializing in real-time Automatic Speech Recognition (ASR) technology delivered via a streaming API. Their core product is a cloud-based ASR engine optimized for low-latency transcription and understanding, particularly in demanding applications like real-time communication and interactive voice response systems. Speechly targets developers building voice-enabled applications requiring high accuracy and speed, offering a developer-friendly alternative to traditional, batch-oriented speech-to-text solutions.

startup $10M

Lelapa AI

Johannesburg, South Africa

Lelapa AI, which originated from the Masakhane research community, develops Natural Language Processing (NLP) technology specifically for African languages. Their core product, the Vulavula API, provides resource-efficient speech-to-text and transcription services for real-time call processing and analysis. Lelapa AI targets businesses operating in African markets seeking to improve customer experience, ensure compliance, and gain actionable insights from multilingual customer interactions.

startup $3M

Rev.com

Austin, United States

Rev.com provides AI-powered transcription and captioning services, specializing in solutions for the legal industry. Their core offering is a 96%+ accurate AI transcription engine designed for high-volume processing of legal evidence like depositions, police reports, and bodycam footage, supplemented by a network of 14,000+ human transcriptionists for 99%+ accuracy when required. Rev targets law firms and legal professionals by offering tools for evidence review, timeline creation, and secure transcript management directly within their platform.

scaleup $0M

iFlytek

Hefei, China

iFlytek develops advanced AI-powered language solutions, including its core Jieli speech recognition platform and translation tools supporting over 60 languages. The company’s innovations center on deep learning models for accurate speech-to-text, text-to-speech, and machine translation, demonstrated in products like their real-time transcription services for meetings and content creation. As China’s leading provider in this space, iFlytek increasingly focuses on international expansion and serves sectors including education, digital marketing, and professional communication.

enterprise

Nuance Communications

Burlington, United States

Nuance Communications, now a Microsoft company, develops AI-powered solutions for clinical and administrative healthcare documentation. Their core technology centers on speech recognition and natural language processing applied to create tools like Dragon Medical One, which automates clinical documentation and enhances radiology reporting. Nuance primarily serves healthcare providers and aims to improve clinician productivity, reduce administrative burden, and enhance patient care through AI-driven workflows.

enterprise

SoundHound

Santa Clara, United States

SoundHound AI develops and licenses voice AI technologies that enable conversational interfaces for a variety of industries, including automotive, retail, and finance. Their core offering is a fully independent voice AI platform capable of handling over 10 billion conversations annually, focusing on agentic AI solutions that automate complex tasks. SoundHound differentiates itself by offering a complete, customizable voice AI solution – rather than relying on cloud-based assistants – allowing businesses to own the entire interaction and maximize ROI through cost reduction and revenue generation.

enterprise

Speak AI

Toronto, Canada

Speak AI is a Canadian company specializing in AI-powered transcription and analysis of audio and video data. Their core product utilizes Automatic Speech Recognition (ASR) and Natural Language Processing (NLP) to convert media into searchable, transcribed text and extract key insights. Speak AI primarily serves researchers and businesses needing to efficiently process and analyze qualitative data from interviews, meetings, and other spoken content.

startup

Whisper (OpenAI)

San Francisco, United States

OpenAI’s Whisper is an open-source automatic speech recognition (ASR) system trained on a massive, diverse 680,000-hour dataset of multilingual speech. Utilizing a Transformer-based encoder-decoder architecture, Whisper excels in robustness to accents and background noise, offering both transcription and translation capabilities across multiple languages. This technology targets developers seeking to integrate highly accurate and versatile speech-to-text functionality into a wide range of applications, particularly where diverse audio conditions or multilingual support are critical.

startup

Acoustic.ai

Copenhagen, Denmark

Acoustic.ai develops voice AI solutions for automotive and consumer electronics, focusing on noise cancellation and voice enhancement.

commercial

Rev AI

San Francisco, United States

Rev AI provides a speech-to-text API specializing in automated transcription and speech recognition services. Their core technology centers on an AI model trained on a large, diverse dataset and designed for high accuracy across varied audio qualities and accents. They target developers and businesses requiring scalable, programmatic transcription solutions for applications like voice search, media monitoring, and accessibility services.

company
