> Personally, it took me ages to figure out how to losslessly recompress a monoc...

jml7c5 · on Nov 10, 2019

I can't recall exactly what steps I went through. It took quite a lot of time. It was one of those rabbit holes where I spend two hours thinking I'm just 10 minutes from being done. If I recall correctly, I did the conversion in hopes of getting my aged Chromebook and phone to load the PDF more quickly; scrolling and page turns were just choppy and slow enough to be an annoyance.

As to the conversion itself: I think every page was just an image (which was fortunate, as I'm not sure how I would have done this otherwise), so I extracted all the images with something like mutool, converted them using jbig2enc after figuring out the flags to use to ensure lossless compression, then used some tool (ghostscript? imagemagick? some dedicated image-to-pdf program?) to smush it all back together.

So the flow was original.pdf -> [page1.png, page2.png, ...] -> [page1.jb2, page2.jb2, ...] -> new.pdf
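
For anyone wanting to try the same thing, here is a rough Python sketch of the first two arrows in that flow. It is not a record of what I actually ran. It assumes `mutool extract` dumps the page images as PNGs into the working directory, that they are already 1-bit, and that the jbig2enc binary is installed as `jbig2`. Run without `-s`, jbig2enc uses generic region coding, which is the lossless mode. The helper names (`extract_cmd`, `encode_cmd`, `jb2_name`, `recompress`) are made up for illustration, and the final step of stitching the streams into new.pdf is left out because I don't remember which tool did it.

```python
import subprocess
from pathlib import Path


def extract_cmd(pdf):
    """argv for dumping the PDF's embedded images into the current directory."""
    return ["mutool", "extract", str(pdf)]


def encode_cmd(image):
    """argv for jbig2enc; without -s it uses generic region coding (lossless)."""
    return ["jbig2", str(image)]


def jb2_name(image):
    """Map page1.png to page1.jb2 (same directory, new suffix)."""
    return Path(image).with_suffix(".jb2")


def recompress(pdf, workdir):
    """Run extraction and encoding; reassembly into a new PDF is not covered."""
    workdir = Path(workdir)
    # mutool writes its output into the current directory, so run it in workdir.
    subprocess.run(extract_cmd(Path(pdf).resolve()), cwd=workdir, check=True)
    outputs = []
    # Assumption: the extracted images are PNGs whose names sort in page order.
    for image in sorted(workdir.glob("*.png")):
        out = jb2_name(image)
        with open(out, "wb") as fh:
            # jbig2enc writes the encoded stream to stdout.
            subprocess.run(encode_cmd(image), stdout=fh, check=True)
        outputs.append(out)
    return outputs
```

Calling `recompress("original.pdf", "pages")` on an existing `pages` directory would leave one `.jb2` file next to each extracted PNG. If the streams are going to be embedded in a PDF rather than kept as standalone files, check jbig2enc's help output for its PDF-oriented option before trusting this as is.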

I can't recall the original format of the images. Obviously it was something less efficient, but it was lossless. Presumably the images were compressed using the less efficient scheme because whoever scanned the pages didn't know they could twiddle some knob on whatever scanner or scanning software they were using. Or there was no knob for them to twiddle.