Handshake acquires Cleanlab to automate data label auditing for AI training
Handshake is buying Cleanlab in an acqui-hire to improve human-labeled data quality with automated error detection for foundation model training.

Key Takeaways
- Handshake acquired Cleanlab largely as an acqui-hire, bringing nine key employees (including three MIT PhDs) into its research org.
- Cleanlab’s tech focuses on automated detection of incorrect labels without a second human reviewer, targeting label-noise reduction in training data.
- Cleanlab raised 30 million dollars and had more than 30 employees at its peak before the acquisition.
- Handshake was last valued at 3.3 billion dollars (2022) and was forecast to end 2025 at 300 million dollars ARR, with external reporting suggesting “high hundreds of millions” ARR this year.
Data quality is becoming a competitive lever in model training, and Handshake is making a talent-heavy bet to tighten its labeling pipeline for AI.
Acqui-hire adds MIT PhDs to Handshake’s data labeling stack
Handshake, which started in 2013 as a college hiring network and later expanded into human data labeling for foundation model builders, has acquired Cleanlab, a startup known for software that audits labels produced by humans. The transaction terms were not disclosed, but the deal is positioned primarily as an acqui-hire: nine key Cleanlab employees are joining Handshake’s research organization, including co-founders Curtis Northcutt, Jonas Mueller, and Anish Athalye, all MIT computer science PhDs.
Cleanlab, founded in 2021, raised 30 million dollars from investors including Menlo Ventures, TQ Ventures, Bain Capital Ventures, and Databricks Ventures, and previously scaled to more than 30 employees.
Automated label error detection aims to raise training data reliability
Cleanlab’s core technology is a set of algorithms that flag likely incorrect labels without requiring a second human reviewer. For teams buying labeled datasets, this matters because label noise can degrade evaluation, fine-tuning, and reinforcement learning workflows, especially when tasks require expensive experts in fields such as medicine, law, or science. Handshake says the acquisition strengthens its ability to identify weak spots in its models and systematically produce higher-quality data for AI labs.
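The article does not describe Cleanlab’s algorithms in detail, so the following is only a minimal sketch of the general idea, not Cleanlab’s actual method or API. It assumes you already have out-of-sample predicted class probabilities from some classifier (the `pred_probs` input) and uses a simplified, confident-learning-style heuristic: an example is flagged when the model is confident in a class other than the one a human assigned. The function name `flag_label_issues` and the toy data are hypothetical.

```python
from statistics import mean


def flag_label_issues(labels, pred_probs):
    """Return indices of examples whose given label looks suspicious.

    labels:     list of int class ids assigned by human labelers.
    pred_probs: list of per-class probability lists, assumed to come
                from a model that did not train on that same example.
    """
    num_classes = len(pred_probs[0])

    # Per-class threshold: the average probability the model gives class k
    # on examples that humans labeled k. Classes with no examples get a
    # threshold of 1.0 so they are effectively never "confident".
    thresholds = []
    for k in range(num_classes):
        probs_k = [p[k] for y, p in zip(labels, pred_probs) if y == k]
        thresholds.append(mean(probs_k) if probs_k else 1.0)

    flagged = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        # Classes the model is "confident" about for this example.
        confident = [k for k in range(num_classes) if p[k] >= thresholds[k]]
        if not confident:
            continue  # model is unsure; do not flag
        likely = max(confident, key=lambda k: p[k])
        if likely != y:
            flagged.append(i)
    return flagged


# Toy data (hypothetical): example 2 is labeled 0, but the model strongly
# prefers class 1, so it is the only one sent back for review.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.95, 0.05],
    [0.85, 0.15],
    [0.10, 0.90],
    [0.05, 0.95],
    [0.15, 0.85],
    [0.40, 0.60],
]
print(flag_label_issues(labels, pred_probs))
```

In a workflow like this, only the flagged examples would need a second look, which is where the savings over blanket double review would come from. The last example shows the heuristic's caution: the model is unsure, so the label is left alone rather than flagged.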
Northcutt said the company had interest from other data-labeling firms, but chose Handshake partly because competitors often source expert labelers through Handshake’s platform. “If you're going to pick one, you should probably pick the source, not the middleman,” he told TechCrunch.
Handshake was last valued at 3.3 billion dollars in 2022 and was forecast to end 2025 at 300 million dollars in annualized revenue run rate. It is also reportedly on track to reach an ARR of “high hundreds of millions” this year, according to an external report from Upstarts Media. The company says it has provided data for eight top AI labs, including OpenAI.
For B2B teams building on foundation models, the takeaway is straightforward: labeling vendors are moving beyond workforce scale into measurable, software-driven quality control, and that should change how you evaluate datasets and suppliers.
