
There is no block to block prediction other than DC prediction, so is this effect due to your distortion function spanning multiple blocks? Same for cross YUV channels, because your metric is in RGB space?

edit: on a second read-through I found the paper [1], which explains it. The answer is basically "yes": the large-scale distortion function is effectively activity masking. Normally this would be implemented with delta-QPs, but because JPEG doesn't have those, Guetzli uses runs of zeroes instead.

[1] https://arxiv.org/pdf/1703.04421



This comes from the internal use of butteraugli, and from making the quantization decisions depend on butteraugli.

Butteraugli uses an 8x8 FFT, but computes it at every third pixel in each direction, so the windows overlap and cover the block boundaries. In later stages of the butteraugli calculation, values are aggregated from an even larger area. Block boundary artefacts are therefore taken into account and affect the quantization decisions.
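To make the coverage argument concrete, here is a rough sketch (not butteraugli's code) that only counts window positions along one axis. It assumes an 8-pixel window, a stride of 3 pixels, and JPEG's 8-pixel block grid; the function names are made up for illustration.

```python
def window_origins(size, window=8, stride=3):
    """Start positions of all windows that fit in `size` pixels (one axis)."""
    return list(range(0, size - window + 1, stride))


def straddles_boundary(origin, window=8, block=8):
    """True if the window covers pixels from two different 8-pixel blocks."""
    first_block = origin // block
    last_block = (origin + window - 1) // block
    return first_block != last_block


origins = window_origins(32)
crossing = [o for o in origins if straddles_boundary(o)]
print(len(crossing), "of", len(origins), "windows cross a block boundary")
```

Under these assumptions most windows span two blocks, which is why a discontinuity at a block edge shows up in the metric.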

Butteraugli operates in neither RGB nor YUV. It has a new color space that is a hybrid of tri-chromatic colors and opponent colors. Black-to-yellow and red-to-green are opponent, but blue is modeled closer to tri-chromatic. A simpler way to think of it: first apply inverse gamma correction, second apply a 3x4 transform to the RGB values, third apply gamma correction, fourth calculate r - g and r + g, keeping blue separate.
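The four steps can be sketched roughly as below. This is only an illustration of the order of operations, not butteraugli's implementation: the 3x4 matrix is an identity-plus-zero-offset placeholder, and both gamma exponents (2.2 and 1/2.2) are assumptions, since the real coefficients are not given here.

```python
# Placeholder 3x4 transform: 3x3 identity plus a zero offset column.
# The real coefficients are NOT known from this thread.
PLACEHOLDER_3X4 = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]


def to_opponent(rgb, matrix=PLACEHOLDER_3X4, gamma_in=2.2, gamma_out=1 / 2.2):
    """Map an (r, g, b) triple in [0, 1] to (r - g, r + g, b)."""
    # 1. Inverse gamma correction: back to (assumed) linear light.
    linear = [c ** gamma_in for c in rgb]
    # 2. 3x4 transform: 3x3 mix of the channels plus a per-row offset.
    mixed = [
        sum(row[j] * linear[j] for j in range(3)) + row[3]
        for row in matrix
    ]
    # 3. Gamma correction again (clamped so the power is well defined).
    r, g, b = [max(v, 0.0) ** gamma_out for v in mixed]
    # 4. Opponent channels for red/green, blue kept separate.
    return (r - g, r + g, b)


print(to_opponent((0.5, 0.5, 0.5)))
```

With the placeholder matrix a neutral gray gives a zero r - g channel, which is the behaviour you would expect from a red-green opponent axis.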


Do you have / plan a paper describing butteraugli itself?

It seems like that's where most of the magic lies. Also peculiarities of human vision are one of my oddball interests, after compression of course. :)


+1 I would be very interested in reading about butteraugli, if there is anything documented.



