This could work, but then you'd need to generate 1000 queries one at a time, mea...

petercooper · on March 24, 2010

This could work, but then you'd need to generate 1000 queries one at a time, meaning lots and lots of disk seeks.

Not with a SELECT .. WHERE id IN (x,x,x,...) type query. But, yeah, sort of. I actually did trial with separate SELECTs for every ID at first and it performed surprisingly well.

As long as your sort table stays in memory you are fine even if it's a lot of data

Trueish. At the time, though, it was a 10-12GB database on a 4GB machine and in 2006 I didn't have the cashflow to get anything better. So the most popular parts of the DB were cached OK, but there was little left for temporary tables. I say "trueish", though, because you're still using more memory bandwidth with temporary tables in this situation and that's an issue sometimes.

Number one being an index on that column.

The problem was that there were about 8 metadata columns and sorts were done with varying combinations of these (though usually one). Indexes became a big problem at a certain point because of how difficult they made it to clear stale data from the database (at one point we were doing 6 hour maintenance periods each month - with caches picking up the slack - to delete old records).

If I were doing it again, I'd do a lot differently (and not use MySQL), of course.. so your suggestions are certainly useful to anyone tackling a similar problem with 2010 eyes.

jseifer · on March 24, 2010

What would you use instead of MySQL?

petercooper · on March 24, 2010

MongoDB is a running joke with you but, seriously, probably MongoDB along with Redis. It wouldn't be using the same algorithms as it did on MySQL, though. There are better ways to structure that sort of data nowadays.

jseifer · on March 24, 2010

It's funny that it comes off as a joke since I really do love it. I mean, it powers http://nyhtp.com. I've had nothing but great experience with Redis, mixed experience with tokyo cabinet and tyrant, and not enough real work experience with Mongo.

petercooper · on March 24, 2010

MongoDB + Redis are proving a winning combination for me so far. Or, rather, the combination of having a fast key/value store with ephemeral key support and a document database is a winner. It gives you two levels and more immediate optimization and separate of concerns opportunities.