
Snowflake Adds Apache Iceberg for Open, AI Data


Snowflake, the AI Data Cloud company, announced the expansion of its core platform capabilities — including unmatched performance, secure data sharing, and advanced data protection — to support Apache Iceberg™ tables, one of the fastest-growing open table formats. This strategic move empowers organizations to activate their data faster and more efficiently, with zero data movement and seamless interoperability, driving open lakehouse adoption and accelerating the development of AI-driven applications.

For years, organizations have faced a tradeoff: choosing between the robust functionality of integrated data platforms and the flexibility of open formats like Parquet. With Snowflake’s full support for Apache Iceberg, that compromise is no longer necessary. Customers can now store, manage, and analyze data in a fully open and interoperable format, all while continuing to leverage the power, simplicity, and trust of Snowflake’s platform. The result is enhanced performance without vendor lock-in, giving global enterprises greater agility to execute on AI strategies.

“The future of data is open, but it also needs to be easy,” said Christian Kleinerman, EVP of Product, Snowflake. “Customers shouldn’t have to choose between open formats and best-in-class performance or business continuity. With Snowflake’s latest Iceberg tables innovations, customers can work with their open data exactly as they would with data stored in the Snowflake platform, all while removing complexity and preserving Snowflake’s enterprise-grade performance and security.”

Unlocking Next-Gen Analytics, Security, and Data Sharing with Iceberg Tables

With its enhanced Iceberg support, Snowflake is enabling customers to take advantage of:

Lakehouse Analytics
Organizations can now run powerful analytics on Iceberg tables using Snowflake’s native compute engine. Soon-to-be generally available services such as Search Optimization Service and Query Acceleration Service will further boost performance. Snowflake’s managed Iceberg tables offer the openness of external formats while preserving the platform’s renowned cost-performance benefits. The company is also actively working with the Apache Iceberg community to support complex data types, such as VARIANT.

Enterprise-Grade Security and Resilience
Snowflake extends its security and governance capabilities to Iceberg tables, offering intuitive controls to help organizations maintain compliance and secure their open lakehouse environments. In addition, Snowflake is piloting reliable data replication and syncing for Iceberg tables (currently in private preview), enabling fast recovery from system outages, cyber incidents, or other disruptions — all within an open architecture.

Data Sharing Without Boundaries
By integrating its industry-leading secure data sharing technology with Iceberg tables, Snowflake allows customers to share, monetize, and access data with the same ease and security they’ve come to expect from Snowflake’s native tables.


Advancing the Open Source Ecosystem and Data Innovation

Snowflake continues to champion open standards and community-driven innovation. Over the past four years, 35% of the company’s acquisitions have focused on strengthening open data ecosystems. This reflects a deep commitment to building tools that enhance transparency and interoperability across the data landscape.

Key open source contributions include:

  • Apache Iceberg™: Snowflake contributes to Iceberg’s capabilities for governed data lake management, supporting schema evolution, partitioning, and transaction integrity.

  • Apache NiFi: Through Datavolo (acquired in 2024), Snowflake simplifies data ingestion and real-time pipeline orchestration.

  • Apache Polaris™ (Incubating): Designed to fight vendor lock-in, Polaris delivers enterprise security and Iceberg compatibility across cloud providers.

  • Modin: Through Ponder (acquired in 2023), the lead maintainer of Modin, Snowflake enables effortless scaling of pandas workloads.

  • Streamlit: This 2022 acquisition lets users build and share rich data applications and visual dashboards.

  • TruEra: Acquired in 2024, TruEra brings enhanced AI model explainability and performance monitoring, supporting bias detection and regulatory compliance.

Customers Highlight the Power of Snowflake + Iceberg

Illumina: “By running analytics on Apache Iceberg tables with Snowflake, we’ve unlocked flexibility and performance in managing our manufacturing system data at scale. This open architecture allows us to seamlessly analyze vast datasets while maintaining cost efficiency, delivering faster insights to improve manufacturing processes and faster access to critical data for self-service. Snowflake’s support for Iceberg has not only improved our data agility but also reinforced the industry-wide push toward open standards, ensuring that innovation in genomics remains accessible, scalable, and impactful for the entire scientific community.” — Stephen Horn, Staff Data Solutions Architect, Illumina

Komodo Health: “At Komodo Health, our mission is to reduce the global burden of disease through our comprehensive Healthcare Map®, platform, tooling, and analytics solutions. Apache Iceberg and open source catalogs like Polaris Catalog have been transformative in helping us create actionable and governed insights from complex healthcare data. Open table formats provide the flexibility, interoperability, and enhanced data governance we need, while Snowflake’s unparalleled performance capabilities ensure we can scale these insights effectively with maximum efficiency. Together, this powerful technology foundation empowers us to make healthcare data more accessible and actionable, ultimately improving patient outcomes across the healthcare ecosystem.” — Laurent Bride, Chief Technology Officer, Komodo Health

Medidata, a Dassault Systèmes brand: “Innovations like Apache Iceberg are critical for Medidata and drive usability for our products like Clinical Data Studio to help our customers achieve faster, more flexible, and simpler data operations. A unified data layer is the foundation for our AI powered platform. Open, interoperable data standards, particularly through Snowflake’s robust open catalog, Iceberg tables, and data collaboration technologies, will further advance our data strategy and propel our industry.” — Tom Doyle, Chief Technology Officer, Medidata

WHOOP: “Data interoperability and flexibility are essential to delivering accurate, real-time insights to our customers. The vendor-neutral design of Apache Iceberg and Apache Polaris Catalog ensures we can seamlessly activate diverse data sources without having to copy it or get locked into a single ecosystem.” — Matt Luizzi, Senior Director of Business Analytics, WHOOP