
How much entropy does the "data about a volume in space" have?

And is this lossless or lossy compression?

Here's a Stack Overflow question with a good answer (http://stackoverflow.com/questions/4716116/what-is-the-best-...):

> If file sizes could be specified accurate to the bit, for any file size N, there would be precisely 2^(N+1)-1 possible files of N bits or smaller. In order for a file of size X to be mapped to some smaller size Y, some file of size Y or smaller must be mapped to a file of size X or larger. The only way lossless compression can work is if some possible files can be identified as being more probable than others; in that scenario, the likely files will be shrunk and the unlikely ones will grow.

> As a simple example, suppose that one wishes to store losslessly a file in which the bits are random and independent, but instead of 50% of the bits being set, only 33% are. One could compress such a file by taking each pair of bits and writing "0" if both bits were clear, "10" if the first bit was set and the second one not, "110" if the second was set and the first not, or "111" if both bits were set. The effect would be that each pair of bits would become one bit 44% of the time, two bits 22% of the time, and three bits 33% of the time. While some strings of data would grow, others would shrink; the ones that shrank would--if the probability distribution was as expected--outnumber those that grow (4/9 files would shrink by a bit, 2/9 would stay the same, and 3/9 would grow).
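To make the quoted example concrete, here is a minimal Python sketch of that pair code. It assumes "33%" means exactly 1/3 (which is what the 4/9, 2/9, 3/9 figures imply); the function names are my own, not from the answer.

```python
from fractions import Fraction
from math import log2

# Codewords from the quoted example: each input pair of bits -> output bits.
# The set {"0", "10", "110", "111"} is prefix-free, so decoding is unambiguous.
CODE = {"00": "0", "10": "10", "01": "110", "11": "111"}
DECODE = {word: pair for pair, word in CODE.items()}


def encode(bits: str) -> str:
    """Encode a bit string of even length, one pair at a time."""
    if len(bits) % 2:
        raise ValueError("expected an even number of bits")
    return "".join(CODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(code: str) -> str:
    """Invert encode() by reading bits until a complete codeword is seen."""
    pairs, word = [], ""
    for bit in code:
        word += bit
        if word in DECODE:
            pairs.append(DECODE[word])
            word = ""
    if word:
        raise ValueError("input ends in the middle of a codeword")
    return "".join(pairs)


def pair_probability(pair: str, p_set: Fraction) -> Fraction:
    """Probability of a pair when bits are independent and set with p_set."""
    prob = Fraction(1)
    for bit in pair:
        prob *= p_set if bit == "1" else 1 - p_set
    return prob


def expected_bits_per_pair(p_set: Fraction = Fraction(1, 3)) -> Fraction:
    """Average output length per input pair under the assumed distribution."""
    return sum(pair_probability(pair, p_set) * len(word)
               for pair, word in CODE.items())


def entropy_bits_per_pair(p_set: float = 1 / 3) -> float:
    """Shannon entropy of two independent bits, the lossless lower bound."""
    return -2 * (p_set * log2(p_set) + (1 - p_set) * log2(1 - p_set))
```

Under that assumption the average works out to 17/9, about 1.89 output bits per 2 input bits, against an entropy of about 1.84 bits per pair. So the scheme is lossless and saves a little on average, but an all-ones input still grows by 50%, which is exactly the trade-off the answer describes.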


