
Arrow is designed for in-memory processing. It can be saved to disk so you can open it directly (via memory mapping), but it's not a great storage format. Parquet and ORC are better choices for storage, but they don't have as much tooling for import/export. CSV is just the simplest way to transfer data.
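To make the "open it directly" point concrete, here is a rough stdlib-only sketch. It is not the Arrow format: the raw float64 column file and the helper names (write_column, read_value_mmap, and so on) are made up for illustration. The idea is just that a memory-mapped binary column can be read in place, while CSV has to be parsed as text first.

```python
import csv
import mmap
import os
import struct
import tempfile


def write_csv(path, values):
    # CSV: values are stored as text under a single "value" header.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["value"])
        for v in values:
            writer.writerow([v])


def read_csv(path):
    # Every cell has to be parsed from text back into a float.
    with open(path, newline="") as f:
        return [float(row["value"]) for row in csv.DictReader(f)]


def write_column(path, values):
    # Toy columnar layout (an assumption, NOT Arrow's real layout):
    # little-endian float64 values stored back to back.
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}d", *values))


def read_value_mmap(path, index):
    # Map the file and decode one value in place. Nothing is parsed,
    # and only the pages that are actually touched get loaded.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return struct.unpack_from("<d", mm, index * 8)[0]


if __name__ == "__main__":
    values = [1.5, 2.25, 3.0, 4.75]
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = os.path.join(tmp, "data.csv")
        col_path = os.path.join(tmp, "data.bin")
        write_csv(csv_path, values)
        write_column(col_path, values)
        print(read_csv(csv_path))
        print(read_value_mmap(col_path, 2))
```

In practice you would use an Arrow library for this rather than hand-rolled bytes; the sketch only shows the trade-off between a parse-free binary layout and a text format that anything can read.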

You might be interested in DuckDB though, which is trying to create a new standard for passing datasets: https://duckdb.org/


