Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This could work, but then you'd need to generate 1000 queries one at a time, meaning lots and lots of disk seeks.

As long as your sort table stays in memory you are fine even if it's a lot of data, but if you hit the disk then you could be right. (But even if mysql says it hits the disk, doesn't mean it really does since the OS could cache it.)

But there are better ways of handling this. Number one being an index on that column.

You could also split the table in two, one part for sort data, the other for the large data. But I would only recommend that in very special cases, and only after careful benchmarking.

Most of the time just increase the maximum size of in-memory temporary tables.



This could work, but then you'd need to generate 1000 queries one at a time, meaning lots and lots of disk seeks.

Not with a SELECT .. WHERE id IN (x,x,x,...) type query. But, yeah, sort of. I actually did trial with separate SELECTs for every ID at first and it performed surprisingly well.

As long as your sort table stays in memory you are fine even if it's a lot of data

Trueish. At the time, though, it was a 10-12GB database on a 4GB machine and in 2006 I didn't have the cashflow to get anything better. So the most popular parts of the DB were cached OK, but there was little left for temporary tables. I say "trueish", though, because you're still using more memory bandwidth with temporary tables in this situation and that's an issue sometimes.

Number one being an index on that column.

The problem was that there were about 8 metadata columns and sorts were done with varying combinations of these (though usually one). Indexes became a big problem at a certain point because of how difficult they made it to clear stale data from the database (at one point we were doing 6 hour maintenance periods each month - with caches picking up the slack - to delete old records).

If I were doing it again, I'd do a lot differently (and not use MySQL), of course.. so your suggestions are certainly useful to anyone tackling a similar problem with 2010 eyes.


What would you use instead of MySQL?


MongoDB is a running joke with you but, seriously, probably MongoDB along with Redis. It wouldn't be using the same algorithms as it did on MySQL, though. There are better ways to structure that sort of data nowadays.


It's funny that it comes off as a joke since I really do love it. I mean, it powers http://nyhtp.com. I've had nothing but great experience with Redis, mixed experience with tokyo cabinet and tyrant, and not enough real work experience with Mongo.


MongoDB + Redis are proving a winning combination for me so far. Or, rather, the combination of having a fast key/value store with ephemeral key support and a document database is a winner. It gives you two levels and more immediate optimization and separate of concerns opportunities.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: