Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

My idea is a different hash construction, which is 2D construction, which has a infinite internal state and infinite output length. Each row and each column also has a sequence number input, and then there are two mixing up functions (each of which has a finite input and output of equal size than the input); part of the result is propagated horizontally and part of it vertically, so each cell has two inputs and the input includes both the message block and the sequence number, which overwrites a part of the internal state (the rest of which has been mixed with the other state). Each output block (the output blocks are vertical and the message blocks are horizontal, which is why it is slow) is then truncated and concatenated to produce the final output, after adding padding at the end of the message (including the length of the message) to mix it up even more. If the message length is m and the hash length is h, then it requires O(h) space and O(mh) time. If ChaCha20 is in use, then it is easy to see (by the kind of calculations made by ChaCha20) that if the input is zero then the output will also be zero; therefore, the initial sequence number should be 1 and not 0. You can also compute a secondary hash which must be misaligned with the first one, so that in case a collision is found in one block that the collision will not be propagated with the other blocks too (unless a collision can be found within two consecutive blocks, in which case you would hope that the extra dimension (since it is a 2D construction) can avoid the collision).


Reading your post I’m sure you know far more about information theory than I do, so forgive me if this is a stupid question, but…

1. How is infinite internal state and output size possible? There has to be some actual limit for internal state, at least, right? Or else you’ll just run out of memory?

2. Wouldn’t a larger output size risk leaking data? The chance of collision becomes lower, but it also seems to toy with what I understand about why we use cryptographic hashes without ridiculous output size.

3A. What happens if you want a 1 MB hash of a 1 KB file?

3B. How predictable does a super long hash become?

At what point is it no longer a one-way hash?


1. Because you will probably be truncating the output to a short finite hash, and because each block of internal state only leads in one direction (it affects all later ("up") blocks of internal state for all later ("right") message blocks, but not previous ("left" and "down")), you can optimize it by only implementing the part that you need. So, yes, an actual implementation will be limited, but if you have enough memory and enough time then the limit can be as high as you want to be. (Maybe a diagram might explain it better; I am not sure that this explanation is any good.) (SHAKE also has infinite output (but finite internal state), and when using SHAKE also you would truncate the output to a finite size.)

2. A larger output size might risk leaking data, although I would think it would be difficult.

3A. Like I described, it then needs O(1MB) space and O(1GB) time, so it will be slow. However, I do not expect you should need a hash that long.

3B. I don't know; probably about as predictable as any random number generator, if the hash is designed correctly. (I only describe a construction, and the hash algorithm design involves more than that.)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: