Iceberg updates
Apache Iceberg 1.3.1 was released on July 25, 2023. The 1.3.1 release addresses various issues identified in the 1.3.0 release.
- Core
  - Table Metadata parser now accepts null for fields: current-snapshot-id, properties, and snapshots (#8064)
- Hive
  - Fix HiveCatalog deleting metadata on failures in checking lock status (#7931)
- Spark
  - Fix RewritePositionDeleteFiles failure for certain partition types (#8059)
  - Fix RewriteDataFiles concurrency edge-case on commit timeouts (#7933)
  - Fix partition-level DELETE operations for WAP branches (#7900)
- Flink
  - FlinkCatalog creation no longer creates the default database (#8039)
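To make the Core item above concrete, here is a minimal Python sketch of what "accepting null" for those three fields can look like when reading a table metadata JSON document. It is an illustration only and not Iceberg's actual parser: the helper name `read_optional_fields` is hypothetical, and the sketch assumes that a JSON null is treated the same way as an absent field.

```python
import json


def read_optional_fields(metadata_json: str) -> dict:
    """Hypothetical helper (not an Iceberg API): read three optional fields,
    treating an explicit JSON null the same as a missing key (an assumption)."""
    doc = json.loads(metadata_json)

    # dict.get returns None both when the key is absent and when its value is null.
    current_snapshot_id = doc.get("current-snapshot-id")
    # Fall back to empty containers so callers can iterate without null checks.
    properties = doc.get("properties") or {}
    snapshots = doc.get("snapshots") or []

    return {
        "current_snapshot_id": current_snapshot_id,
        "properties": properties,
        "snapshots": snapshots,
    }


# A metadata fragment in which all three fields are explicitly null.
sample = (
    '{"format-version": 2, "current-snapshot-id": null, '
    '"properties": null, "snapshots": null}'
)
parsed = read_optional_fields(sample)
```

A parser that instead assumed these fields were always present and non-null would fail on `sample`, which appears to be the class of input the 1.3.1 fix is meant to tolerate.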
The community has also added the following noteworthy items, which will be part of the upcoming Iceberg 1.4.0 release:
- Added registerTable to the REST Catalog
- Switched the FLIP-27 Flink source to a JSON parser for FileScanTask
- Fixed an issue with WAP branches and deletes
- Updated the View APIs and the View Spec
- Added OAuth2 token support for GCSFileIO
- Fixed an issue with single-byte reads in GCSFileIO
- Improved CREATE OR REPLACE Branch/Tag
- Fixed unicode handling in HTTP client
PyIceberg updates
PyIceberg 0.4.0 was released, and a blog post is available that dives into the significant changes.
The 0.4.0 release took a while, but the next release will follow soon. If you’re missing anything in 0.4.0, please raise an issue on GitHub, or reach out in the #python channel on the community Slack.
PyIceberg 0.5.0 already has some awesome features lined up:
- AWS Lambda compatibility
- SqlCatalog support (JDBCCatalog in Java)
- Major speed improvements to the Avro parser
- GCS support
The first building blocks for the write path (Avro writers and metrics collection) are also being worked on, but they require thorough testing, so write support might be part of a later version.
More information can be found on the project site, and the package is available on PyPI.
Rust and Go
If you’re excited about non-JVM implementations of Iceberg, there is now a #rust and a #go channel on the community Slack. The Golang implementation is underway; please check it out and get involved. If you’re interested in the Rust implementation, please follow the GitHub repository. Join the community Slack to contribute and stay up to date on developments.
Iceberg in the industry
- Snowflake — What’s New: Apache Iceberg With Snowflake — Snowflake Summit 2023
- DuckDB — Initial experimental support of Iceberg
- Amazon — Amazon Redshift now supports querying Apache Iceberg tables
- CelerData — StarRocks now supports read/write to Iceberg tables
- GlareDB — Adding Iceberg support
- Netezza — Netezza’s Evolution From Warehouse to Lakehouse With Watsonx.data
- Monte Carlo — Iceberg, Right Ahead! 7 Apache Iceberg Best Practices for Smooth Data Sailing
- Anton Okolnychyi and Chao Sun — Eliminating Shuffles in Delete, Update, and Merge (video)
- stLGo — Cliff Gilmore presents Streaming Ingestion into Apache Iceberg (video)
Blogs from the community
- Amazon — Backtesting index rebalancing arbitrage with Amazon EMR and Apache Iceberg
- Amazon — Choosing an open table format for your transactional data lake on AWS
- Tabular — The CDC MERGE Pattern
- Tabular — Iceberg in Modern Data Architecture
- Cloudera — 12 Times Faster Query Planning With Iceberg Manifest Caching in Impala
- Areca Data — Streaming Data Lakehouse Foundations: Powering Real-Time Insights with Kafka, Flink, and Iceberg
- Ayush Saxena — Apache Hive 4.x With Apache Iceberg (Part-I)
- David Jayatillake — The Modern Data Stack is Dead, Long Live the Modern Data Stack — Part 2
- Amazon — Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics
Iceberg in the news
- Select.dev: Every Major Announcement at Snowflake Summit 2023 and 1 Word Never Mentioned
- The Next Platform: ENTERPRISES ARE NOT GOING TO MISS THE FOURTH WAVE OF AI
Keep up to date on all things Iceberg
Watch for new videos on the Iceberg YouTube Channel
Read blog posts added to the Blogs page
See the community Contribute guide to learn how to start contributing to Iceberg
Join the Apache Iceberg workspace on Slack using the invite link
Subscribe to the Apache Iceberg mailing list