
Columnar storage systems rarely store the raw value at a fixed position. They store values run-length encoded, dictionary encoded, delta encoded, etc., and then store metadata about chunks of values for pruning at query time. So you can rarely seek to an offset and update a value. The compression achieved means less data to read from disk on large scans and lower storage costs for very large datasets that are largely immutable, which are some of the important benefits of columnar storage.
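
A minimal sketch of the problem in plain Python (no real column-store code): once a column is run-length encoded there is no fixed byte offset per value, and a single point update means finding the run that covers the row, splitting it, and rewriting the encoding.

    # Run-length encode a column as (value, run_length) pairs.
    def rle_encode(values):
        runs = []
        for v in values:
            if runs and runs[-1][0] == v:
                runs[-1][1] += 1
            else:
                runs.append([v, 1])
        return runs

    # Point-update logical row `index`: one run may split into three.
    def rle_update(runs, index, new_value):
        out, pos = [], 0
        for value, length in runs:
            if pos <= index < pos + length:
                before = index - pos
                after = length - before - 1
                if before:
                    out.append((value, before))
                out.append((new_value, 1))
                if after:
                    out.append((value, after))
            else:
                out.append((value, length))
            pos += length
        return out

    runs = rle_encode(["a"] * 1000 + ["b"] * 1000)
    print(runs)                        # [['a', 1000], ['b', 1000]]
    print(rle_update(runs, 500, "z"))  # one update turns 2 runs into 4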

Also, many applications that require updates do so conditionally (update a where b = c). This requires re-synthesizing (at least some of) each candidate row to make the comparison, another relatively expensive operation for a column store.
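
Roughly what that looks like against dictionary-encoded columns; a hedged sketch, not any real engine's API. The predicate on b yields row positions, which then force decode/patch/re-encode work on a:

    # Dictionary-encode a column as (dictionary, codes).
    def dict_encode(values):
        dictionary = sorted(set(values))
        index = {v: i for i, v in enumerate(dictionary)}
        return dictionary, [index[v] for v in values]

    def dict_decode(dictionary, codes):
        return [dictionary[c] for c in codes]

    a_dict, a_codes = dict_encode(["red", "red", "blue", "red"])
    b_dict, b_codes = dict_encode(["x", "y", "x", "x"])

    # WHERE b = 'y': scan b's codes for the dictionary id of 'y'.
    target = b_dict.index("y")
    matches = [i for i, c in enumerate(b_codes) if c == target]

    # SET a = 'green': decode a, patch matching rows, re-encode the column.
    a_values = dict_decode(a_dict, a_codes)
    for i in matches:
        a_values[i] = "green"
    a_dict, a_codes = dict_encode(a_values)
    print(a_dict, a_codes)  # ['blue', 'green', 'red'] [2, 1, 0, 2]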



Also typically stored with general-purpose binary compression (snappy, zlib) applied after the semantic encoding. In-memory formats might only be semantically encoded, e.g., Arrow.
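
With pyarrow/Parquet the two layers show up as separate knobs: use_dictionary controls the semantic encoding and compression the byte-level compressor applied on top, while the Arrow table in memory carries only the semantic layer. A small sketch:

    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.table({"user": ["alice", "bob", "alice"] * 1000,
                      "clicks": [1, 2, 3] * 1000})

    # On disk: dictionary-encode the columns, then snappy-compress the pages.
    pq.write_table(table, "events.parquet",
                   use_dictionary=True, compression="snappy")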

But it's... fine? Batch the writes and rewrite the dirty parts. Most of our cases are either appending events or enriching with new columns, both of which can be modeled columnarly. It is a bit more painful in GPU land because we like big chunks (250MB-1GB) to saturate reads, but CPU land is generally fine for us.
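
A minimal copy-on-write sketch of that pattern with pyarrow; the paths and columns here are made up. Rather than seeking into a file, read the dirty partition, patch it in memory, and write the whole thing back:

    import pyarrow.parquet as pq
    import pyarrow.compute as pc

    # Read the one dirty partition.
    part = pq.read_table("events/date=2024-01-01/part-0.parquet")

    # Enriching with a new column is cheap to model columnarly.
    part = part.append_column("clicks_x2", pc.multiply(part["clicks"], 2))

    # Rewrite the partition wholesale.
    pq.write_table(part, "events/date=2024-01-01/part-0.parquet")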

We have been eyeing Iceberg and friends as a way to automate that, so I've been curious how much of that optimization, if any, they handle for us.



