Collibra, a global leader in unified data and AI governance, has announced its acquisition of Deasy Labs, an innovative startup known for its advanced capabilities in automated discovery and enrichment of unstructured data. With this strategic move, Collibra becomes the first platform to offer true unified governance spanning both structured and unstructured data, creating a reliable foundation for AI, analytics, and compliance initiatives.
Through this acquisition, Collibra significantly expands its ability to process and transform unstructured content, including contracts, transcripts, emails, and reports, into trusted, AI-ready assets. With classification, filtering, and enrichment automated, customers will be able to unlock value from unstructured data with minimal manual effort, using intelligent, auto-generated taxonomies that drive more precise GenAI outputs, smarter search capabilities, and stronger regulatory compliance.
“As organizations scale their use of AI, the ability to unlock the value of unstructured data becomes critical,” said Felix Van de Maele, Collibra Founder and CEO. “Deasy Labs gives us the ability to tag, filter and enrich this dark data at scale, automatically turning unstructured files into structured, meaningful and trusted data assets ready for AI. This is a leap forward for the industry, and for Collibra’s vision of unified data and AI governance.”
Turning Unstructured Data Into Business Value
Unstructured data, including PDFs, meeting transcripts, and emails, represents more than 90% of enterprise data, yet much of it remains underutilized. Deasy Labs’ technology addresses this challenge by connecting directly to these sources, discovering relevant taxonomies, enriching documents with structured metadata, and integrating this information into downstream tools such as semantic search engines, GenAI assistants, and AI-based automation workflows.
Founded in 2023 by a team of AI and metadata experts from McKinsey & Company, and backed by the Y Combinator program, Deasy Labs has quickly earned trust across sectors such as financial services and healthcare. The platform’s ability to deliver automated semantic enrichment eliminates the need for complex AI pipelines or costly manual labeling, accelerating insights and reducing operational burdens.
“Our mission at Deasy Labs has always been to help organizations make sense of the massive volume of unstructured content they generate every day,” said Reece Griffiths, co-founder of Deasy Labs. “By joining Collibra, we can now bring that mission to life at scale, integrating deep semantic enrichment into a platform that’s already trusted by the world’s leading data teams.”
Seamless Integration with the Collibra Platform
Deasy Labs’ technology will be embedded within the Collibra Platform over the coming months, enabling customers to take advantage of:
- Smart Discovery: Quickly scan repositories and identify high-value content through advanced semantic tagging, accelerating AI readiness.
- Automated Semantic Layer: Replace manual metadata creation with auto-enrichment processes that structure content and align it to business taxonomies.
- Enterprise-Scale AI Search: Support high-performance enterprise search and GenAI experiences, even at massive data volumes, by adding contextual understanding to unstructured assets.
This integration reinforces Collibra’s commitment to empowering enterprises with full lifecycle governance of AI, from initial data discovery and enrichment to policy enforcement and lineage, across all data types.
“Unifying governance across all structured and unstructured data into trusted, governed data assets is no longer optional,” said Sanjeev Mohan, Principal at SanjMo and former Gartner Analyst. “Metadata-driven automation is key to unlocking the hidden value in documents, emails, and transcripts as it brings much-needed visibility and control to the least governed parts of the data estate. By bringing unstructured data into the fold of unified governance, Collibra is taking a critical step toward operationalizing AI at scale with confidence.”