Use Stockfish to predict the next move, and only store the diff between the actual move and the prediction.
Or, better, get a ranked list of all possible moves from Stockfish, and use a variable-length integer to encode the position in the list. Then the best move takes ~1 bit, and worse moves take more bits. (And we can do fun things like compute how bad a player is by how large their games are after compression.)
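Roughly what I have in mind, as a runnable sketch. `ranked_moves` here is just a stand-in for the engine query (a real version would ask Stockfish, e.g. via python-chess's `engine.analyse(..., multipv=N)`); the rank is written with an Elias gamma code, so rank 1 (the predicted best move) costs a single bit:

```python
import chess  # pip install python-chess

def ranked_moves(board):
    """Stand-in for an engine query: return legal moves best-first.
    Here we just sort by UCI string so the example runs without an
    engine; a real version would rank by Stockfish's MultiPV output."""
    return sorted(board.legal_moves, key=lambda m: m.uci())

def gamma_encode(n):
    """Elias gamma code for n >= 1: rank 1 costs a single '1' bit,
    rank n costs about 2*log2(n) + 1 bits."""
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b

def gamma_decode_one(bits, i):
    """Read one gamma-coded integer starting at bit offset i."""
    zeros = 0
    while bits[i + zeros] == "0":
        zeros += 1
    n = int(bits[i + zeros : i + 2 * zeros + 1], 2)
    return n, i + 2 * zeros + 1

def encode_game(moves):
    board, bits = chess.Board(), []
    for move in moves:
        rank = ranked_moves(board).index(move) + 1  # 1-based rank
        bits.append(gamma_encode(rank))
        board.push(move)
    return "".join(bits)

def decode_game(bits):
    board, moves, i = chess.Board(), [], 0
    while i < len(bits):
        rank, i = gamma_decode_one(bits, i)
        move = ranked_moves(board)[rank - 1]  # same ranking => same move
        moves.append(move)
        board.push(move)
    return moves

game = [chess.Move.from_uci(u) for u in ["e2e4", "e7e5", "g1f3"]]
assert decode_game(encode_game(game)) == game
```

Note that decoding only works because both sides compute the identical ranking for every position, which is exactly the fragility pointed out in the reply below.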
The more advanced your predictor is, the slower your compressor gets. OP has 600 million moves to encode; how long does it take to ask Stockfish for its opinion on 600M board states? (And then again, every time you want to decompress.) (Not a rhetorical question btw, I know little about chess engines.)
I suspect the sweet spot here would be to use a much worse chess engine for the predictions, giving faster compression/decompression at the expense of the compression ratio.
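To make "much worse chess engine" concrete: the predictor can be as crude as a one-ply material count. This hypothetical ranker could replace `ranked_moves` in the sketch above, trading compression ratio (worse ranks, longer codes) for orders-of-magnitude faster encoding and decoding:

```python
import chess

# Rough piece values in centipawns.
PIECE_VALUE = {chess.PAWN: 100, chess.KNIGHT: 300, chess.BISHOP: 300,
               chess.ROOK: 500, chess.QUEEN: 900, chess.KING: 0}

def material(board, color):
    return sum(PIECE_VALUE[p.piece_type]
               for p in board.piece_map().values() if p.color == color)

def cheap_ranked_moves(board):
    """One-ply greedy ranking: prefer moves that win the most material."""
    def gain(move):
        board.push(move)
        mover = not board.turn  # push() flipped the side to move
        g = material(board, mover) - material(board, not mover)
        board.pop()
        return g
    # Tie-break on the UCI string: the ranking must be fully
    # deterministic, or the decompressor disagrees with the compressor.
    return sorted(board.legal_moves, key=lambda m: (-gain(m), m.uci()))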
It looks like a very interesting comparison. I'm still not sure how to transform Stockfish scores into probabilities. (Perhaps proportional to exp(-difference/temperature), where temperature is a magic number, or maybe something that scales with Elo? There are too many parameters to tweak.)
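One plausible mapping (an assumption on my part, not anything Stockfish defines) is exactly that softmax over centipawn scores, p_i ∝ exp(score_i / T), with T the magic temperature. The -log2(p) line shows what an arithmetic coder would then pay per move:

```python
import math

def move_probabilities(centipawn_scores, temperature=100.0):
    """Softmax over engine scores (mover's point of view, centipawns).
    temperature is the magic number from the comment above -- one could
    try fitting it per rating bucket (e.g. as a function of Elo)."""
    best = max(centipawn_scores)
    weights = [math.exp((s - best) / temperature) for s in centipawn_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Example: best move, then alternatives 30cp and 200cp worse.
for p in move_probabilities([0, -30, -200]):
    # An arithmetic coder would spend -log2(p) bits on this move.
    print(f"p = {p:.3f}, cost = {-math.log2(p):.2f} bits")
```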
I was imagining downloading the Lichess database and trying different strategies to compress it. It's too much work for me, but I'd love to read a blog post if someone does that.
> Use Stockfish to predict the next move, and only store the diff between the actual move and the prediction.
This ties the algorithm down to one specific version of Stockfish, configured identically (hash table size, thread count, etc.), because all such factors have an impact on Stockfish's evaluations. If one factor changes, you can't decompress the backup.
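Concretely, the archive would need a header pinning the engine identity and every option that affects search, checked before decompression. A hypothetical sketch (the option names are real UCI options; the header format is made up):

```python
import json

# Stored alongside the compressed games. The decompressor must refuse
# to run unless its engine setup matches exactly, since any drift
# changes the rankings and silently corrupts every move that follows.
HEADER = {
    "engine": "Stockfish 16.1",
    "options": {"Hash": 256, "Threads": 1, "MultiPV": 500},
    # Threads must be 1: multithreaded search is nondeterministic.
    "search": {"depth": 12},
}

def check_header(stored, current):
    if stored != current:
        raise RuntimeError(
            "engine setup differs from the one used to compress; "
            f"expected {json.dumps(stored)}")
```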