Data Labeling Companies

Explore 77 Data Labeling companies in our AI directory. Leading companies include Palantir, Procore AI, and Indigo Agriculture.

77 Companies

Palantir

Denver, United States

Palantir Technologies develops data integration and analysis platforms leveraging artificial intelligence and machine learning. Their core products, including Foundry and Gotham, enable organizations to connect disparate data sources and operationalize insights through a central operating system. Palantir primarily serves government agencies and large enterprises requiring complex data analysis for operational decision-making and strategic planning, particularly in sectors like defense, finance, and healthcare.

enterprise $50.0B

Procore AI

Carpinteria, United States

Procore Technologies provides a comprehensive construction management platform utilized throughout all phases of a project, from preconstruction to closeout. Their core innovation is Procore Helix, an AI-powered intelligence layer that leverages analytics and agentic workflows to automate tasks and deliver actionable insights. Procore targets general contractors and construction managers seeking to improve project efficiency, safety, and financial outcomes through data-driven decision-making.

enterprise $10.0B

Indigo Agriculture

Boston, United States

Indigo Agriculture is an agricultural technology company focused on improving farm profitability and sustainability through a combination of biological seed treatments and data-driven regenerative agriculture programs. Their core technology utilizes naturally occurring microbes to enhance crop resilience and yield, coupled with a digital platform to measure and verify soil carbon sequestration for carbon credit generation. Indigo targets both farmers seeking increased productivity and corporations aiming to reduce Scope 3 emissions and meet sustainability goals through traceable, agriculture-based carbon offsets.

scaleup $1.2B

Hopper

Montreal, Canada

Hopper is a travel booking platform leveraging AI-powered price prediction to assist consumers with travel planning. Their core technology analyzes billions of data points to forecast future price fluctuations for flights, hotels, and car rentals, recommending optimal booking times. Targeting cost-conscious travelers, Hopper aims to deliver the lowest price on travel accommodations through data-driven insights and proactive price monitoring.

scaleup $735M

Fivetran

Oakland, United States

Fivetran is a data integration platform that automates the movement of data from over 700 sources – including SaaS applications, databases, and files – into cloud data warehouses and lakes. Their fully managed pipelines eliminate the need for custom ETL code, enabling businesses to consolidate data for analytics and AI initiatives. Fivetran primarily serves organizations requiring reliable, scalable data integration to power business intelligence, data science workflows, and cloud migrations.

scaleup $730M

Scale AI

San Francisco, United States

Scale AI provides data labeling and annotation services critical for developing and deploying machine learning models. Their core offering is a managed data platform utilizing human-in-the-loop workflows and proprietary technology to deliver high-quality training data at scale. Scale AI primarily serves companies operating in data-intensive AI applications like autonomous vehicles, geospatial intelligence, and robotics, accelerating their model development lifecycles.

scaleup $600M

Own Company

Englewood Cliffs, United States

Own from Salesforce delivers data protection and backup solutions specifically for Software-as-a-Service (SaaS) applications. Utilizing AI-powered automation, the platform proactively monitors, protects, and manages SaaS data to ensure resilience against loss, corruption, and compliance risks. Primarily targeting organizations reliant on SaaS for critical business functions, Own simplifies data privacy management and accelerates development through secure test data generation within the Salesforce ecosystem.

scaleup $500M

Gogoro

Taipei, Taiwan

Gogoro is a Taiwanese technology company specializing in electric two-wheel vehicle solutions and battery swapping infrastructure. Their core offering is a networked battery swapping platform – utilizing data analytics and smart station management – that enables rapid, automated battery exchange for electric scooters and motorcycles. Targeting both consumers and commercial fleets, particularly in densely populated urban environments, Gogoro aims to overcome range anxiety and accelerate the adoption of electric micro-mobility.

enterprise $480M

dbt Labs

Philadelphia, United States

dbt Labs provides a data transformation platform focused on enabling reliable and governed data pipelines. Their core product, dbt Fusion, is a next-generation data engine designed to accelerate analytics and AI initiatives through improved performance and cost efficiency. dbt Labs targets data teams within organizations seeking to improve data quality and governance as a foundation for trustworthy AI and data-driven decision-making.

scaleup $414M

Aiven

Helsinki, Finland

Aiven provides a managed, open-source data platform enabling organizations to efficiently stream, store, and serve data across multi-cloud environments. Their core offering integrates and optimizes popular open-source data technologies – including databases, streaming platforms, and search engines – as a fully managed service. Aiven targets data-intensive businesses seeking to reduce operational overhead and accelerate AI/ML initiatives by simplifying complex data infrastructure management.

scaleup $410M

Lattice

San Francisco, United States

Lattice is a US-based HR technology platform that utilizes AI-powered analytics to optimize people management processes. Their core product integrates performance reviews, employee engagement surveys, and compensation data to provide actionable insights for HR professionals and managers. Lattice targets mid-sized to large organizations seeking to improve employee performance, increase retention, and reduce administrative burden within their HR departments.

scaleup $328M

Maven Clinic

New York, United States

Maven Clinic is a virtual care platform providing comprehensive women’s and family health support to employers and health plans. Utilizing a data-driven approach and on-demand access to specialists – including reproductive endocrinologists and mental health providers – Maven aims to improve health outcomes and reduce costs associated with fertility, pregnancy, and postpartum care. Their core value proposition centers on proactive, personalized support throughout all paths to parenthood, ultimately driving employee retention and lowering healthcare expenditures for organizations.

scaleup $300M

Sisense

New York, United States

Sisense provides an embedded analytics platform that allows businesses to integrate AI-powered data modeling, visualization, and customization directly into their products and workflows. Their technology focuses on simplifying the development of end-to-end analytics experiences with a flexible approach spanning pro-code, low-code, and no-code options. Sisense primarily serves companies looking to monetize data assets or enhance existing products with data-driven insights, reducing the need for extensive custom development.

scaleup $273M

Immuta

Boston, United States

Immuta provides data access control and governance software for enterprises managing complex data environments. Their platform utilizes automated workflows and policy enforcement to streamline data access requests, enabling self-service access while maintaining compliance with data privacy regulations. Immuta targets large organizations in regulated industries like finance, healthcare, and government seeking to accelerate data utilization and overcome bottlenecks in data access provisioning.

scaleup $267M

Monte Carlo

San Francisco, United States

Monte Carlo provides a data observability platform specializing in monitoring and troubleshooting AI/ML pipelines, from data inputs to model outputs. Their technology focuses on identifying and resolving data quality issues that impact the reliability and trustworthiness of AI agents in production. Monte Carlo targets enterprise teams seeking to increase confidence in their AI investments and accelerate adoption by ensuring data integrity and mitigating risks associated with inaccurate or biased results.

startup $236M

Kiteworks

Palo Alto, United States

Kiteworks provides a Private Data Network (PDN) platform that secures sensitive data exchanges both internally and externally. Their solution utilizes AI-powered anomaly detection alongside encryption and least-privilege access controls to govern file sharing and email communications. Kiteworks targets highly regulated industries and organizations requiring robust data privacy and compliance, integrating with existing security infrastructure to minimize risk and prevent data breaches.

scaleup $231M

Placer.ai

Los Altos, United States

Placer.ai is a location analytics provider that leverages AI-powered foot traffic data to deliver insights into consumer behavior and location performance. Their core product utilizes mobile location data to quantify visits, demographics, and movement patterns around points of interest. Placer.ai primarily serves retail, real estate, and economic development organizations seeking to optimize site selection, measure campaign effectiveness, and understand market trends.

scaleup $214M

BigID

New York, United States

BigID provides an enterprise-grade data security platform that discovers, classifies, and manages sensitive data across all on-premise, cloud, and SaaS environments. Utilizing patented AI-driven classification with over 1,000 pre-trained classifiers, the platform enables organizations to automate data privacy, compliance, and risk remediation at scale. BigID targets large enterprises needing comprehensive data visibility and control to meet evolving regulatory requirements and mitigate data-related risks.

scaleup $212M

SafeGraph

Denver, United States

SafeGraph is a US-based data provider specializing in comprehensive points-of-interest (POI) data for geospatial analysis. They leverage machine learning and human verification to curate a highly accurate and detailed database of global POI attributes – including brand affiliation, hours, and polygon geometry – delivered through platforms like AWS Marketplace and Domo. This data enables businesses in sectors like real estate, retail, and analytics to improve market trend identification, mapping applications, and consumer behavior analysis.

scaleup $200M

Zego

London, United Kingdom

Zego is a UK-based InsurTech company providing commercial motor insurance solutions. They leverage app-based telematics and data analytics to offer flexible, usage-based policies primarily targeting the growing gig economy and businesses with variable driving needs. This approach allows Zego to offer competitive pricing and incentivize safer driving habits through personalized insurance coverage.

scaleup $200M

Tomorrow.io

Boston, United States

Tomorrow.io is a weather intelligence platform that leverages proprietary space-based sensors and AI models to deliver hyperlocal and highly accurate weather forecasts. Their core product is a Weather API providing 60+ data layers and historical data, optimized for integration with AI agents and automated workflows. Targeting businesses across industries, Tomorrow.io enables proactive risk mitigation and operational optimization in the face of increasingly severe weather events.

startup $190M

Labelbox

San Francisco, United States

Labelbox provides a comprehensive data factory platform enabling organizations to build and operationalize high-quality training data for AI models. Their core technology centers on software and services for data labeling, data generation, and model evaluation, with a focus on advancements like Reinforcement Learning with Verifiable Rewards (RLVR). Labelbox targets AI teams requiring scalable, standardized, and scientifically rigorous data pipelines to accelerate the development and deployment of computer vision and NLP applications.

startup $188M

Timescale

New York, United States

Timescale is a data infrastructure company specializing in TimescaleDB, an open-source PostgreSQL extension optimized for time-series data. Their platform enables scalable storage and analysis of time-series data – including metrics, events, and streams – with features like hybrid row/columnar storage and continuous aggregates. Timescale targets developers building applications requiring high-performance time-series analytics, particularly in areas like IoT, DevOps, and industrial telemetry, offering a cost-effective alternative to traditional time-series databases.

scaleup $181M

Halter

Auckland, New Zealand

Halter develops precision livestock management technology for dairy farms, utilizing solar-powered, GPS-enabled collars to create virtual fences and monitor animal behavior. Their system transmits data via a localized network, eliminating the need for cellular coverage, and provides farmers with real-time insights into grazing patterns, animal health, and heat detection via a mobile application. This allows for optimized pasture management, improved animal welfare, and increased farm productivity without the infrastructure costs of traditional fencing.

scaleup $175M

EightSleep

New York, United States

EightSleep develops the Pod, an AI-powered mattress cover that utilizes active temperature regulation and biometric monitoring to optimize sleep performance. The Pod tracks metrics like heart rate variability, respiratory rate, and sleep stages, dynamically adjusting temperature on each side of the bed for personalized comfort. Targeting consumers seeking data-driven sleep improvement and enhanced recovery, EightSleep offers a non-invasive solution compatible with existing mattresses.

scaleup $160M

Oura

Oulu, Finland

Oura is a Finnish health technology company specializing in a smart ring wearable that tracks key physiological data. Utilizing sensor data and AI-powered algorithms, the Oura Ring continuously monitors metrics like sleep stages, activity levels, and heart rate variability to provide users with personalized health insights. Oura targets health-conscious individuals seeking proactive, data-driven approaches to wellness and recovery, offering a discreet and comfortable alternative to traditional wrist-worn trackers.

scaleup $148M

Hawkeye 360

Herndon, United States

Hawkeye 360 is a U.S.-based company that utilizes a proprietary constellation of Low Earth Orbit satellites and advanced RF signal processing to deliver space-based signals intelligence (SIGINT). Their core technology detects, geolocates, and characterizes a broad spectrum of radio frequency signals – including radar, communications, and GPS – providing customers with unique insights into maritime, terrestrial, and airborne activity. Hawkeye 360 primarily serves government and defense organizations seeking enhanced situational awareness, threat detection, and intelligence gathering capabilities.

scaleup $145M

Snorkel AI

Redwood City, United States

Snorkel AI is a data-centric AI platform specializing in the programmatic development of high-quality training datasets. Their core technology utilizes programmatic labeling techniques to accelerate data curation and improve model accuracy, particularly for large language models and enterprise AI applications. Snorkel AI targets organizations requiring specialized, rapidly developed datasets to optimize performance and reduce the time-to-deployment of their AI initiatives.

startup $135M

Reonomy

New York, United States

Reonomy is a U.S.-based PropTech company specializing in commercial real estate data intelligence. They utilize machine learning algorithms to standardize and connect disparate property data sources through a proprietary identification system, the Reonomy ID. This technology enables investors, brokers, and owners to unlock deeper insights from existing datasets and identify off-market opportunities within the commercial real estate sector.

startup $128M

Mode Analytics

San Francisco, United States

Mode Analytics provides a collaborative data platform that unifies SQL, Python, and R-based analysis with visual analytics tools. Their platform enables data teams to rapidly develop and deploy analytical insights while simultaneously empowering business users with self-service reporting capabilities. Mode targets data-driven organizations seeking to accelerate analytical workflows and improve cross-functional collaboration around data assets, without requiring rigid data modeling or extensive implementation.

scaleup $127M

Appen

Sydney, Australia

Appen provides comprehensive data solutions for the artificial intelligence lifecycle, specializing in the collection, annotation, and curation of high-quality training data. Their core offering is a customizable platform and services suite designed to deliver scalable, auditable datasets crucial for developing and refining both foundation and enterprise-level AI models, with a particular focus on supporting generative AI applications. Appen targets enterprises seeking to reliably deploy AI by addressing the critical need for trustworthy and diverse training data at scale.

enterprise $100M

Nansen

Singapore, Singapore

Nansen is a Singapore-based blockchain analytics platform specializing in on-chain data for the Web3 ecosystem. Their core product utilizes AI-powered analysis of over 300 million labeled wallet addresses across 20+ blockchains to identify emerging trends and “Smart Money” activity. Nansen primarily serves professional crypto investors and teams seeking data-driven insights for due diligence and informed trading decisions within the cryptocurrency market.

scaleup $87M

Sylvera

London, United Kingdom

Sylvera is a UK-based climate tech company providing a data and analytics platform for the voluntary and compliance carbon credit markets. Utilizing machine learning and multi-scale lidar technology, Sylvera rates the quality of carbon credits and delivers pricing intelligence, supply/demand forecasts, and retirement data. Their platform serves enterprises, investors, and governments seeking transparency and risk mitigation in carbon offsetting and investment decisions.

startup $80M

Arago

Frankfurt, Germany

Arago, now operating as Almato, develops Bardioc, a semantic data platform that leverages knowledge graph technology to automate IT and enterprise operations. The platform ingests and models organizational data to deliver actionable insights and recommendations, particularly for provisioning and base configuration tasks. Almato primarily targets governments and highly regulated industries seeking to improve operational efficiency through data-driven automation.

enterprise $70M

Lean Technologies

Riyadh, Saudi Arabia

Lean Technologies is a Saudi Arabian fintech company providing open banking APIs that enable secure data exchange and payment processing for financial institutions and fintechs across the Middle East and North Africa (MENA) region. Their platform utilizes AI-driven data enrichment to standardize and categorize financial data, facilitating services like account verification, income assessment, and streamlined payments. Lean targets banks and fintech companies seeking to build and scale innovative financial products by leveraging reliable and accessible banking infrastructure.

startup $67M

Sama

San Francisco, United States

Sama provides data annotation and model evaluation services specializing in computer vision and generative AI applications. They combine human expertise with proprietary AI-assisted tooling to deliver high-quality training data, accelerating model development and reducing costs for their clients. Sama primarily serves large technology companies, including 40% of FAANG businesses, and distinguishes itself through a commitment to ethical AI sourcing and a focus on rapid iteration and data governance.

startup $66M

Castor

Amsterdam, Netherlands

Castor is a Netherlands-based technology company providing a unified electronic clinical data management system (CDMS) for the healthcare and life sciences industries. Their platform integrates electronic data capture (EDC), eConsent, ePRO, and other essential tools to streamline clinical trial workflows and facilitate both traditional and decentralized trial designs. Castor targets academic researchers, medical device companies, and pharmaceutical organizations seeking a cost-effective, self-service solution to manage clinical trial data from initial study build through analysis.

scaleup $65M

Keepit

Copenhagen, Denmark

Keepit provides independent data protection for SaaS applications, specializing in backup and recovery as a service. Their core technology is an air-gapped, immutable cloud storage architecture designed to isolate backup data from production environments and potential threats. Targeting organizations requiring robust data resilience and compliance with regulations like GDPR and HIPAA, Keepit offers a cost-effective alternative to relying on SaaS provider backup solutions and ensures complete data control.

scaleup $60M

Lifebit

London, United Kingdom

Lifebit provides a data intelligence platform specializing in federated genomic and health data analysis for biomedical research. Their core technology is a federated lakehouse enabling secure, cross-institutional data access and analysis without requiring data transfer. Lifebit targets research organizations and healthcare institutions seeking to accelerate precision medicine discoveries while maintaining data privacy and compliance.

scaleup $60M

Supermetrics

Helsinki, Finland

Supermetrics is a marketing intelligence platform that automates data integration from numerous online and offline sources into data warehouses, analytics platforms, and activation tools. Their core technology centers on a data pipeline that cleans, standardizes, and maps marketing data to revenue, enabling comprehensive performance analysis. Supermetrics primarily serves marketing agencies and brands seeking to improve data accuracy, streamline reporting, and demonstrate marketing’s direct impact on business growth through a unified data view.

scaleup $50M

Cherre

New York, United States

Cherre provides a data integration and analytics platform specifically for the real estate industry. Their core technology unifies disparate real estate data sources into a single source of truth, leveraging data modeling and standardization to improve data quality and accessibility. This enables real estate investors, lenders, and developers to enhance decision-making, streamline operations, and accelerate AI initiatives through reliable, connected data.

startup $50M

DefinedAI

Seattle, United States

DefinedAI operates a marketplace for AI training data, connecting organizations with ethically sourced and annotated datasets for machine learning applications. Their core offering is a platform facilitating the buying, selling, and custom commissioning of diverse data types – including visual and transcribed data – with a focus on commercial safety and creator rights. DefinedAI targets companies developing and deploying AI, particularly those prioritizing ethical data practices and high-performance generative AI solutions.

scaleup $50M

Kognic

Gothenburg, Sweden

Kognic is a Swedish AI company providing a data platform specifically for the development of autonomous vehicle systems. Their core technology facilitates the scalable integration of human-in-the-loop feedback for complex annotation tasks beyond basic perception, including trajectory ranking and multimodal data grounding. Kognic targets autonomous vehicle developers seeking to improve model performance through high-volume, high-quality training data while maintaining safety and compliance.

startup $47M

Bodo

Pittsburgh, United States

Bodo is a US-based high-performance computing platform that accelerates Python-based data analytics and AI workloads. Their core technology is a compute engine designed to dramatically speed up and scale existing Python code – particularly pandas workflows – without requiring code rewrites. Bodo targets data science and analytics teams needing to process large datasets (terabytes to petabytes) with improved performance and reduced infrastructure costs, while maintaining compatibility with the Python ecosystem and avoiding vendor lock-in.

startup $45M

Garner Health

New York, United States

Garner Health is a U.S.-based company applying AI to improve healthcare access and cost-efficiency for self-insured employers. Their core product is a clinical AI platform that analyzes provider data – including quality metrics, cost variations, and utilization patterns – to identify high-value care options. This enables employers to steer employees towards demonstrably better providers, ultimately lowering healthcare expenditures and improving employee outcomes.

startup $45M

V7

London, United Kingdom

V7 is a UK-based company offering an AI agent platform, V7 Go, focused on automating complex document processing workflows. Utilizing computer vision and data labeling technologies, V7 Go enables businesses to build and deploy specialized AI agents for tasks like contract analysis, claims processing, and financial document review. Their target market is primarily within the finance, legal, and insurance industries seeking auditable and automated solutions for knowledge work.

startup $42M

Synerise

Krakow, Poland

Synerise is a Polish technology company specializing in a customer data platform (CDP) powered by its proprietary AI engine, Cleora. Cleora utilizes behavioral data processing and machine learning to predict customer behavior and automate personalized marketing experiences. Synerise targets enterprise-level businesses seeking to improve customer engagement and revenue through advanced data-driven personalization.

scaleup $40M

MOSTLY AI

Vienna, Austria

MOSTLY AI is an Austrian company specializing in synthetic data generation for machine learning applications. Their platform utilizes generative adversarial networks (GANs) to create statistically accurate, privacy-safe datasets mirroring sensitive information. MOSTLY AI targets enterprises and data science teams requiring access to realistic data for AI model training and testing while adhering to data privacy regulations like GDPR.

startup $31M

Encord

London, United Kingdom

Encord provides data annotation and model evaluation tools for building safer AI, with a focus on quality control and active learning.

commercial $30M

HumanSignal

San Francisco, United States

HumanSignal develops Label Studio, a fully configurable, open-source data labeling platform designed for complex data types including audio, video, and time-series data. Their technology focuses on enabling high-quality, human-in-the-loop annotation with customizable UIs, workflow orchestration, and traceable oversight for compliant AI development. HumanSignal serves both enterprise clients seeking to embed unique data into AI systems and frontier AI labs requiring novel datasets, offering both a self-serve platform and managed services for dataset creation and annotation.

startup $25M

SuperAnnotate

Sunnyvale, United States

SuperAnnotate is a data annotation platform specializing in tools for computer vision model training. Their core technology provides a feedback-driven pipeline for creating and evaluating high-quality training data, focusing on iterative improvement and quality control. The platform targets machine learning teams across diverse industries requiring robust and accurate labeled datasets for image and video-based AI applications.

startup $22M

Pula

Nairobi, Kenya

Pula provides agricultural insurance and data solutions to mitigate climate risk for smallholder farmers and agribusinesses across Africa and Asia. Their core technology combines index-based and indemnity-based insurance models with geospatial data analytics to deliver scalable, data-driven coverage and insights. Pula uniquely serves both farmers and the organizations that rely on them – including governments and exporters needing to meet traceability requirements like the EUDR – fostering resilience and sustainable agricultural practices.

startup $20M

Atlas AI

Palo Alto, United States

Atlas AI is a geospatial intelligence company that leverages satellite imagery and machine learning to generate predictive economic indicators. Their core product is a platform providing granular, real-time forecasts of supply and demand across various sectors, particularly in emerging markets. This data enables investors, governments, and development organizations to make data-driven decisions regarding resource allocation and economic strategy in complex environments.

startup $20M

Surge AI

San Francisco, United States

Surge AI is a US-based data labeling company specializing in high-quality training data for Reinforcement Learning from Human Feedback (RLHF) and Large Language Models (LLMs). Their core offering is a managed data labeling platform focused on complex annotation tasks like preference labeling and reward modeling, critical for aligning AI behavior. Surge AI targets AI developers and research teams building and refining generative AI applications requiring nuanced human input for optimal performance.

startup $14M

Voxel51

Ann Arbor, United States

Voxel51 provides a data curation platform specializing in tools for computer vision model development. Their core technology enables automated data transformation, cleaning, and analysis of large-scale visual datasets – reportedly processing over 20TB – to improve model accuracy. Voxel51 targets AI development teams seeking to optimize the performance of their computer vision applications through enhanced data quality and efficiency.

startup $13M
Apheris logo - Data Labeling AI company

Apheris

Berlin, Germany

Apheris provides a federated learning platform that enables collaborative AI model training across distributed, sensitive datasets – specifically within the life sciences industry. Their technology allows organizations to securely leverage both proprietary and public data to improve model accuracy and generalizability, without data ever leaving their control. This addresses a key challenge in life sciences AI development, where data diversity is limited by privacy and intellectual property concerns.

startup $12M
Understand.ai logo - Data Labeling AI company

Understand.ai

Karlsruhe, Germany

Understand.ai is a German data services company specializing in ground truth data creation and annotation for advanced driver-assistance systems (ADAS) and autonomous driving (AD) applications. Their core offering is a scalable ground truth platform leveraging automation and a partner network to deliver high-quality, consistently annotated datasets – including lidar, camera, and radar data – at volume. They primarily serve automotive manufacturers and technology companies developing and validating autonomous vehicle technologies, with a focus on data security evidenced by their TISAX Level 2 certification.

scaleup $12M
Sigmoid logo - Data Labeling AI company

Sigmoid

Bengaluru, India

Sigmoid is a data engineering and AI solutions provider specializing in the implementation of MLOps and advanced analytics for enterprise clients. The company's services encompass data fabric modernization, end-to-end data management, and the development of Domain-Specific Language Models (DSLMs) and Agentic AI solutions – as recognized in Gartner’s 2026 Strategic Technology Trends report. Sigmoid targets organizations seeking to operationalize AI quickly and demonstrate ROI on their analytics investments, particularly within the CPG and supply chain sectors.

scaleup $12M
Browse AI logo - Data Labeling AI company

Browse AI

Vancouver, Canada

Browse AI provides a no-code web scraping and data extraction platform that transforms websites into structured APIs. Their technology utilizes automated site layout monitoring and human behavior emulation to reliably extract data – including from dynamic content – without requiring coding or technical expertise. They serve a broad user base seeking to automate data collection from virtually any website for integration into their existing systems, and also offer fully-managed data extraction services for customized needs.

startup $9M
Supahands logo - Data Labeling AI company

Supahands

Kuala Lumpur, Malaysia

Supahands provides AI training data services, combining human expertise with machine learning for data annotation and content moderation.

commercial $8M
Quilt Data logo - Data Labeling AI company

Quilt Data

San Francisco, United States

Quilt Data provides a data management platform specifically for life sciences research and development teams. Utilizing data versioning and AI-powered metadata tagging, Quilt organizes raw data, results, and associated metadata into searchable, versioned assets directly within a customer’s AWS environment. This enables improved data governance, collaboration, and accelerated research by establishing a single source of truth for complex scientific data.

startup $6M
Presight.ai logo - Data Labeling AI company

Presight.ai

Abu Dhabi, United Arab Emirates

Presight.ai provides AI-powered analytics solutions for enterprises and government organizations seeking to derive insights from their existing data infrastructure. Their core technology is a data governance-focused AI analyst that operates directly on data warehouses and lakes, aiming for zero-hallucination results by answering questions solely from verified metrics. This approach avoids data movement and the associated risk of data leakage, offering a scalable and compliant alternative to traditional or general-purpose AI analytics platforms.

scaleup
Farmers Edge logo - Data Labeling AI company

Farmers Edge

Winnipeg, Canada

Farmers Edge, now operating under Corvian, delivers enterprise-level digital transformation solutions for the agriculture, food, and finance industries. Their core offering is a precision agriculture platform leveraging satellite imagery, IoT data, and patented technologies to optimize crop performance and supply chain efficiency. Corvian differentiates itself through a managed services framework combining agronomic expertise with robust technology infrastructure, enabling scalable deployments for large organizations.

scaleup
Second Spectrum logo - Data Labeling AI company

Second Spectrum

Los Angeles, United States

Second Spectrum is a US-based sports analytics company specializing in automated tracking and data collection using computer vision. Their core product, Performance Studio, delivers detailed player and ball movement data to professional sports leagues like the NBA. This technology gives teams and broadcasters objective insights for performance analysis, strategy development, and enhanced content creation.

startup
Catapult logo - Data Labeling AI company

Catapult

Melbourne, Australia

Catapult provides wearable sensor technology and video analysis tools to collect and interpret athletic performance data. Their platform utilizes machine learning algorithms to provide objective insights into athlete workload, technique, and injury risk. Catapult primarily serves professional and elite-level sports teams and organizations seeking data-driven performance optimization and injury prevention strategies.

scaleup
KAYAK logo - Data Labeling AI company

KAYAK

Stamford, United States

KAYAK is a metasearch engine that aggregates travel data from hundreds of websites to provide comprehensive flight, hotel, and car rental options. Their core technology utilizes price prediction algorithms and user-defined alerts to forecast price fluctuations and identify optimal booking times. Targeting cost-conscious travelers, KAYAK delivers a single platform for comparison shopping and itinerary management, eliminating the need to manually check multiple sources.

enterprise
Deepen AI logo - Data Labeling AI company

Deepen AI

San Jose, United States

Deepen AI provides safety-first data lifecycle tools and services for autonomous systems with multi-sensor data labeling and calibration.

startup
Sigma AI logo - Data Labeling AI company

Sigma AI

Louisville, United States

Sigma AI is a US-based provider of managed data labeling services specializing in high-quality training data for artificial intelligence models. They utilize a quality-first approach, employing human-in-the-loop techniques and rigorous quality assurance protocols to deliver accurate and reliable datasets. Sigma AI targets AI development teams requiring scalable and dependable data annotation, particularly those prioritizing model performance and minimizing bias.

startup
Sourcepoint logo - Data Labeling AI company

Sourcepoint

New York, United States

Sourcepoint provides a data privacy management platform for digital publishers and enterprises navigating complex consent regulations like GDPR and CCPA. Their technology utilizes AI-powered automation to manage consumer consent preferences, streamline vendor risk assessment, and facilitate compliance across digital properties. This enables clients to operationalize data privacy at scale while balancing regulatory requirements with business objectives, particularly within the advertising ecosystem.

scaleup
Entegra logo - Data Labeling AI company

Entegra

Greenwood Village, United States

Entegra is a U.S.-based procurement services provider specializing in cost optimization and supply chain management for the food service and hospitality industries. They leverage a data-driven platform to provide clients with savings on food and supplies, alongside tools for order management and contract compliance across a network of 2,500+ suppliers. Entegra targets hospitality businesses seeking to improve operational efficiency and reduce costs through data-backed procurement strategies and market insights.

enterprise
EQT Group logo - Data Labeling AI company

EQT Group

Stockholm, Sweden

EQT Group is a global investment organization leveraging proprietary AI technology to enhance investment analysis and drive value creation within its portfolio companies. Their core AI product focuses on data-driven insights for due diligence, identifying growth opportunities, and optimizing operational performance across a range of industries. Targeting the private equity market, EQT differentiates itself by integrating AI directly into the investment lifecycle, aiming to deliver superior returns and sustainable growth for its investments.

enterprise
Nearmap logo - Data Labeling AI company

Nearmap

Sydney, Australia

Nearmap provides frequently updated, high-resolution aerial imagery and location data to professional users. Their core technology utilizes AI-powered image analysis to extract insights – such as building footprints, vegetation analysis, and change detection – from this imagery. Primarily targeting governments, insurance companies, and construction/infrastructure businesses, Nearmap delivers geospatial data to support planning, assessment, and operational decision-making.

enterprise
Parse.ly logo - Data Labeling AI company

Parse.ly

New York, United States

Parse.ly provides content analytics and data pipeline infrastructure specifically for digital media companies and marketers. Their platform utilizes real-time event-level data processing to deliver an accessible dashboard and API for content performance insights and personalized content recommendations. By simplifying data access and analysis, Parse.ly enables editorial and marketing teams to optimize content strategy and demonstrate ROI without requiring specialized data science expertise.

scaleup
Planet Labs logo - Data Labeling AI company

Planet Labs

San Francisco, United States

Planet Labs provides daily, global satellite imagery and geospatial analytics to commercial and government clients. Their core offering leverages a large constellation of satellites – including next-generation platforms like Owl™ and Pelican – and AI-powered analysis to deliver frequent, high-resolution Earth observation data. This enables customers to monitor change, gain insights, and make data-driven decisions across sectors like agriculture, mapping, and defense.

enterprise
Protegrity logo - Data Labeling AI company

Protegrity

Salt Lake City, United States

Protegrity provides data security and privacy solutions for enterprises leveraging modern data environments, including cloud and analytics platforms. Their core technology, SecureAgentic AI, governs and secures AI agents and workflows to ensure data governance and compliance. Protegrity targets organizations needing to protect sensitive data while enabling trustworthy and compliant AI-driven insights and operations.

enterprise
Trifacta logo - Data Labeling AI company

Trifacta

San Francisco, United States

Trifacta, now integrated with Alteryx, provides a cloud-based data preparation platform leveraging AI-powered data wrangling capabilities. Their core product utilizes intelligent profiling and transformation suggestions to accelerate the process of cleaning, shaping, and preparing data for analysis. Trifacta targets data analysts and scientists seeking to rapidly build automated data pipelines and derive insights from diverse datasets within the Alteryx ecosystem.

scaleup
Testworks logo - Data Labeling AI company

Testworks

Seoul, South Korea

Testworks employs people with disabilities to provide AI training data services, combining social impact with quality data annotation.

commercial