This talk is an introduction to the Iceberg table format. It's meant as an examination of why table formats like Iceberg came to be, a deep dive into how Iceberg works, and a tour of some of its more advanced features. It also covers the necessary care and feeding of Iceberg tables, and some of Iceberg's current rough edges, informed by multiple years of running Iceberg in production with clients.
In this talk, we'll cover:
* The origins of Iceberg, and the motivation for table formats generally
* How Iceberg works internally, with an emphasis on how it provides transactional semantics on top of object storage
* Some of the fancier features of Iceberg, including branching, time travel, and the write-audit-publish pattern
* Compaction, garbage collection, and the small files problem in Iceberg
* The importance of data catalogs when working with Iceberg
I hope that the audience will come away knowing more about Iceberg than they did before, and having a better idea of whether Iceberg is a good fit for their systems.



