I wrote the Lanczos3 scaler that Picasa uses (about 14 years ago!), and it doesn't (by default) correct for gamma.
For most images & kernels, you can get by with doing the math in about 31 bits of total precision (the remaining bit is kept for underflow clamping), so that's the magic speed improvement from ignoring gamma on old 32-bit architectures.
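To make the bit budget concrete, here is a rough sketch of a fixed-point 1-D Lanczos3 downscale on 8-bit samples. It is not the Picasa code: `WEIGHT_BITS = 20`, the weight normalisation, and the function names are all assumptions picked for illustration. The point is only that 8-bit samples times quantised weights stay inside a signed 32-bit accumulator, with the sign catching negative-lobe undershoot so it can be clamped to 0.

```python
import math

WEIGHT_BITS = 20  # hypothetical fixed-point scale for the kernel weights


def lanczos3(x):
    """Lanczos kernel with a = 3."""
    if x == 0.0:
        return 1.0
    if abs(x) >= 3.0:
        return 0.0
    px = math.pi * x
    return 3.0 * math.sin(px) * math.sin(px / 3.0) / (px * px)


def fixed_weights(center, scale, n_src):
    """Integer weights for one output sample, summing to exactly 1 << WEIGHT_BITS."""
    support = 3.0 * scale
    lo = max(0, math.ceil(center - support))
    hi = min(n_src - 1, math.floor(center + support))
    raw = [lanczos3((i - center) / scale) for i in range(lo, hi + 1)]
    total = sum(raw)
    w = [round(r / total * (1 << WEIGHT_BITS)) for r in raw]
    # Push the rounding error into a middle tap so flat areas stay flat.
    w[len(w) // 2] += (1 << WEIGHT_BITS) - sum(w)
    return lo, w


def downscale_row(src, n_dst):
    """Downscale a row of 8-bit values; also report the largest |accumulator| seen."""
    n_src = len(src)
    scale = n_src / n_dst
    out, peak = [], 0
    for j in range(n_dst):
        center = (j + 0.5) * scale - 0.5
        lo, w = fixed_weights(center, scale, n_src)
        acc = 1 << (WEIGHT_BITS - 1)  # rounding offset
        for k, wk in enumerate(w):
            acc += wk * src[lo + k]
        peak = max(peak, abs(acc))
        v = acc >> WEIGHT_BITS
        # Negative lobes can undershoot below 0 or overshoot above 255: clamp.
        out.append(0 if v < 0 else 255 if v > 255 else v)
    return out, peak


edge = [0] * 32 + [255] * 32
pixels, peak = downscale_row(edge, 16)
print(pixels)
print("accumulator bits used:", peak.bit_length() + 1)  # +1 for the sign bit
```

Python integers never overflow, so the `peak` value is there only to show what a 32-bit implementation would have to hold.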
If you gamma-correct, your sources will need to be 10-12 bpc after the transformation to linear light, and you'll need >32 bits of integer precision for large kernels. Even though x86 has a 32x32 -> 32-bit high-word multiply, this all gets a lot better on x64.
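A quick way to see both numbers is to build the sRGB-to-linear lookup table at different output depths and count how many of the 256 input codes stay distinct, then work out the worst-case accumulator width. This is a sketch under assumptions: the standard sRGB transfer curve stands in for "gamma", and the 20-bit weight scale and the bound of 2.0 on the sum of absolute kernel weights are made-up illustrative values, not anything from the actual scaler.

```python
import math


def srgb_to_linear(c):
    """Standard sRGB decoding curve; c and the result are in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_lut(bits):
    """Map each 8-bit sRGB code to a linear-light integer of the given depth."""
    top = (1 << bits) - 1
    return [int(srgb_to_linear(v / 255.0) * top + 0.5) for v in range(256)]


def distinct_codes(bits):
    """How many of the 256 input codes are still distinguishable afterwards."""
    return len(set(linear_lut(bits)))


def accumulator_bits(sample_bits, weight_bits, abs_weight_sum=2.0):
    """Worst-case signed accumulator width for sum(w_i * s_i).

    abs_weight_sum is an assumed upper bound on sum(|w_i|) for a
    normalised kernel with negative lobes.
    """
    magnitude = ((1 << sample_bits) - 1) * abs_weight_sum * (1 << weight_bits)
    return math.ceil(math.log2(magnitude)) + 1  # +1 for the sign bit


WEIGHT_BITS = 20  # hypothetical fixed-point scale for the kernel weights
for bits in (8, 10, 12):
    print(bits, "bpc linear:",
          distinct_codes(bits), "distinct codes,",
          accumulator_bits(bits, WEIGHT_BITS), "accumulator bits")
```

Dark sRGB codes sit very close together in linear light, which is why 8-bit linear storage merges many of them, and the extra sample bits are what push the accumulator past 32 bits once the weights need fine resolution.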
So basically: today on 64-bit machines (or wide SIMD) you could do gamma-correct resampling a whole lot cheaper than you could do 10 years ago.
[edit] In the end you notice the difference with line art, but for most photographs you want to spend your CPU budget on sharpness (by using wider kernels).