
It depends. If the data is coming from a pipe (like core_pattern) then yes, you have to check for runs of zeroes. If it's coming from a filesystem, then you can skip the holes without reading them (specifically with the SEEK_HOLE and SEEK_DATA whence values of lseek(2)).
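As a rough sketch of the lseek(2) approach, the loop below walks the data extents of an open file and adds up their sizes. The helper name count_data_bytes is made up for illustration, and the code assumes Linux with glibc, where _GNU_SOURCE is needed to expose SEEK_DATA and SEEK_HOLE. A real copier would copy each [data, hole) range instead of just counting it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc, which hides SEEK_DATA/SEEK_HOLE otherwise */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of bytes that live in data
 * extents of fd, or -1 on error. Holes are skipped without being read. */
static off_t count_data_bytes(int fd)
{
    off_t total = 0;
    off_t pos = 0;

    for (;;) {
        /* Find the start of the next data extent at or after pos. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data == (off_t)-1) {
            if (errno == ENXIO)
                return total;   /* no more data before end of file */
            return -1;
        }

        /* Find where that extent ends; end of file counts as a hole. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole == (off_t)-1)
            return -1;

        /* [data, hole) is the range a sparse-aware copy would transfer. */
        total += hole - data;
        pos = hole;
    }
}
```

On a filesystem that doesn't track holes, the whole file is typically reported as a single data extent, so the loop still terminates and simply finds nothing to skip.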

Also if the data is being copied into userspace anyway, then it's quite fast to check that memory is zero. There's no C "primitive" for this, but all C compilers can turn a simple loop into relatively efficient assembler[1].
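For illustration, the "simple loop" can be as small as the sketch below. The name is_zero is just a placeholder, and this is not necessarily the variant from the linked answer. How well it gets optimised depends on the compiler and flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if all len bytes starting at buf are zero.
 * An empty buffer (len == 0) counts as all-zero. */
static bool is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0)
            return false;   /* stop at the first non-zero byte */
    }
    return true;
}
```

You'd call this on each block after read(2) and, when it returns true, seek forward in the output instead of writing the block.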

If you're using an API that never copies the data into userspace and you have to read from a pipe, then yes sparse detection will be much more expensive.

In either case it should save disk space for core files which are highly sparse.

[1] https://stackoverflow.com/a/1494021



The easiest way to handle things like sparse files correctly is to invoke a program like GNU dd that already has this feature built in (conv=sparse). GNU cp handles it, too (--sparse=always), but it doesn't accept input from stdin.



