
I just tried splitting the Anna's Archive 250GB WorldCat JSON file out into separate files so I could random-access it. My app crashed at 7m files due to some corruption in the source file.

I can tell you, it took a real, real long time to delete seven million files on NTFS. It was not happy.
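For anyone trying the same thing, here is a rough sketch of getting random access without creating millions of files: scan the big file once and keep a byte offset per record. It assumes the dump is newline-delimited (one JSON record per line), which I have not verified for this file. The function names `build_offset_index` and `read_record` are made up for illustration.

```python
from array import array


def build_offset_index(path):
    """Scan the file once and return the starting byte offset of every line.

    Assumption: one record per line (newline-delimited JSON).
    """
    offsets = array("Q")  # unsigned 64-bit ints, 8 bytes per record
    with open(path, "rb") as f:
        pos = 0
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets


def read_record(path, offsets, n):
    """Return the raw bytes of the n-th line (0-based), without the newline."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline().rstrip(b"\n")
```

At 8 bytes per record, an index for tens of millions of records stays in the hundreds of megabytes, and the filesystem only ever sees one large file.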



NTFS and Windows filesystem I/O do poorly with many small files. It's one of the reasons that Git for Windows (and Windows Subsystem for Linux version 1) is inevitably slow.


Since the file is sorted you can binary-search it: https://annas-software.org/AnnaArchivist/annas-archive/-/blo...
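A minimal sketch of what binary-searching by byte offset could look like, not the code behind the link. It assumes one JSON record per line, no blank lines, and ascending order by a numeric key; the field name `id` and the function names are hypothetical, so adjust them to the real schema.

```python
import json
import os


def read_line_at_or_after(f, pos):
    """Return (start_offset, line_bytes) for the first line starting at or after pos."""
    if pos == 0:
        f.seek(0)
    else:
        # Step back one byte and finish the line containing it; if that byte
        # is a newline we land exactly on pos, which is then a line start.
        f.seek(pos - 1)
        f.readline()
    start = f.tell()
    return start, f.readline()  # line is b"" at end of file


def find_record(path, target_id, key="id"):
    """Binary-search a sorted, line-delimited JSON file; return the record or None."""
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        while lo < hi:
            mid = (lo + hi) // 2
            start, line = read_line_at_or_after(f, mid)
            if line and json.loads(line)[key] < target_id:
                lo = start + len(line)  # target lies after this line
            else:
                hi = mid  # target is this line or earlier, or we hit EOF
        _, line = read_line_at_or_after(f, lo)
        if line:
            record = json.loads(line)
            if record[key] == target_id:
                return record
    return None
```

Each lookup costs roughly log2(file size) seeks, so around 40 small reads for a 250GB file, and no index or file splitting is needed. It only works for the key the file is sorted on.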


Looks like sorted by ID, though, sadly? :(


> NTFS was angry that day, my friends!

When is it not angry?



