Hacker News

Do you have any recommended config for speeding up tests that hit Postgres? We currently use these settings:

  fsync = off
  synchronous_commit = off
  full_page_writes = off

And truncate all tables between each test



If you're truncating/dropping tables all the time, it can actually be beneficial to set shared_buffers to a pretty small value. When dropping/truncating a table, Postgres needs to scan the entire buffer cache to evict all entries belonging to that relation (this happens at the end of the transaction). The mapping is a hash table, so there's no ordered scan that could do this more cheaply.

Also, do the truncations all in a single transaction. Since the actual dropping/truncations only happens after commit (for obvious reasons), we've at least optimized multiple truncations to happen in one pass.
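To make the "single transaction" point concrete, here's a minimal sketch. A single TRUNCATE statement can also name multiple tables, which keeps it to one commit-time pass over the buffer cache. The table names are placeholders, and RESTART IDENTITY / CASCADE are optional depending on whether you have sequences and foreign keys:

```sql
-- One transaction, one pass over shared_buffers at commit.
BEGIN;
TRUNCATE TABLE users, accounts, transactions
    RESTART IDENTITY CASCADE;
COMMIT;
```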

It's unfortunate that a larger shared buffers can hurt this way :(

I've long hoped to find the cycles to finish a prototype that converts the buffer mapping to a hash table (for relations) over radix trees (for blocks in the relation).


Nice, that's a good tip. It's a little hard to test small speedups like this, but I used https://hackage.haskell.org/package/bench and got these results:

128MB — run 1: ~/D/M/h/mercury-web-backend> caffeinate bench "make test"

  benchmarking make test
  time     226.4 s  (NaN s .. 240.6 s)     1.000 R²  (0.999 R² .. 1.000 R²)
  mean     222.3 s  (220.6 s .. 224.4 s)
  std dev  2.390 s  (104.1 ms .. 2.986 s)
  variance introduced by outliers: 19% (moderately inflated)

128MB — run 2: ~/D/M/h/mercury-web-backend> caffeinate bench "make test"

  benchmarking make test
  time     225.8 s  (208.8 s .. 233.9 s)   0.999 R²  (0.999 R² .. 1.000 R²)
  mean     221.6 s  (215.8 s .. 224.8 s)
  std dev  5.576 s  (1.671 s .. 7.491 s)
  variance introduced by outliers: 19% (moderately inflated)

2MB — run 1: ~/D/M/h/mercury-web-backend> caffeinate bench "make test"

  benchmarking make test
  time     216.9 s  (204.5 s .. 229.5 s)   1.000 R²  (0.998 R² .. 1.000 R²)
  mean     219.7 s  (216.7 s .. 221.3 s)
  std dev  2.863 s  (1.283 s .. 3.658 s)
  variance introduced by outliers: 19% (moderately inflated)

2MB — run 2: ~/D/M/h/mercury-web-backend> caffeinate bench "make test"

  benchmarking make test
  time     209.6 s  (192.8 s .. 224.1 s)   0.999 R²  (0.999 R² .. 1.000 R²)
  mean     216.0 s  (211.0 s .. 218.5 s)
  std dev  3.732 s  (1.442 s .. 5.168 s)
  variance introduced by outliers: 19% (moderately inflated)

So it seems like 2MB shared buffers shaves a few seconds off our tests. Is 2MB in the range you were thinking of?


> So it seems like 2MB shared buffers shaves a few seconds off our tests. Is 2MB in the range you were thinking of?

Well, hard to say without knowing anything about your tests. I think the cases where I've seen it make a really big difference were mostly ones where people kept increasing shared_buffers trying to improve test performance, without noticing that it made things worse.

I'd probably not go below 16MB, otherwise it's plausible you could run into errors (e.g. if you have a large number of open cursors or such - they can each end up pinning a few buffers, and with only 256 buffers (2MB at the default 8kB page size), you don't need to be too creative to exhaust them).
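For reference, a sketch of applying that floor. shared_buffers is a postmaster-level setting, so this only takes effect after a restart; ALTER SYSTEM writes it to postgresql.auto.conf (you could equally edit postgresql.conf by hand):

```sql
-- Test-instance setting only; don't shrink shared_buffers like this in production.
ALTER SYSTEM SET shared_buffers = '16MB';
-- then restart the server, e.g.: pg_ctl restart -D "$PGDATA"
```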


You really don't want to do that, because the test environment won't be representative of the real environment (unless your production DB also runs with fsync=off etc., which is only justifiable in very specific cases).


I’m willing to accept a less-than-perfect simulation of our production environment in order to speed up development, especially for settings that e.g. only matter when the machine crashes (a big deal in production, not in tests).

We already make a leap of faith by developing on Mac/Linux distros other than what we use in production. Making Postgres not-crash safe locally is a pretty small change compared to that.


One caveat: Disabling fsyncs can hide problematic access patterns. If you were to e.g. accidentally perform a large batch of DML in one-row increments and forget to wrap it in a single transaction, you'd see no real penalty in your test environment, but a much bigger one in production (one fsync per transaction in the non-concurrent case).
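A sketch of the pattern in question (the events table and values are made up for illustration). With fsync on, each autocommitted statement below forces its own WAL flush; with fsync=off the two versions look equally fast:

```sql
-- Anti-pattern: each statement autocommits, so every row pays a
-- commit-time WAL flush in production. Invisible with fsync=off.
INSERT INTO events (payload) VALUES ('a');
INSERT INTO events (payload) VALUES ('b');
-- ... many more single-row autocommitted inserts ...

-- Better: one transaction, one flush at COMMIT for the whole batch.
BEGIN;
INSERT INTO events (payload) VALUES ('a');
INSERT INTO events (payload) VALUES ('b');
COMMIT;
```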

Btw, disabling synchronous_commit isn't going to do much once you've disabled fsync. What s_c=off basically does is defer the journal flush to a background process that performs flushes regularly (unless one has already been done by another session / for another reason) - but with fsync=off, that flush isn't happening anyway. And s_c=off does have some small overhead of its own.


You can have two pipelines, one for typical development / merge requests and another for pre-releases.


I think this is dev machine vs. CI?



