
While there have been some truly misguided mapreduce implementations, mapreduce is just a computation model, and it isn't inherently slower than others. A relational aggregation of the kind you get with SQL like:

  select foo, count(*) from bar group by foo
...is essentially a mapreduce, although most databases probably don't use a reduce buffer larger than 2. (But they would benefit from it if they could use hardware vectorization, I believe.)
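To make the correspondence concrete, here is a minimal sketch of that query written as explicit map, shuffle, and reduce steps. The function names and the sample rows are made up for illustration, and the pairwise fold is only one reading of what a "reduce buffer of 2" means; no real database or mapreduce framework is claimed to work this way.

```python
from collections import defaultdict
from functools import reduce


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the 'foo' column."""
    for row in rows:
        yield row["foo"], 1


def shuffle(pairs):
    """Group the emitted values by key, like GROUP BY foo."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Fold each key's values two at a time into a single count."""
    return {
        key: reduce(lambda acc, value: acc + value, values)
        for key, values in groups.items()
    }


def count_by_foo(rows):
    """Rough equivalent of: select foo, count(*) from bar group by foo"""
    return reduce_phase(shuffle(map_phase(rows)))


# Hypothetical contents of the table "bar".
bar = [{"foo": "a"}, {"foo": "b"}, {"foo": "a"}]
counts = count_by_foo(bar)
```

The reduce step here only ever combines an accumulator with one new value. A wider buffer would combine several values per step, which is where vectorized hardware could plausibly help.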

Mapreduce works great if you are already sequentially churning through a large subset of a table, which is typically the case with aggregations such as "count" and "sum". Where it becomes foolish is when you use it for real-time queries that only need to extract a tiny subset of the dataset.
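A rough sketch of the difference, using an in-memory list as a stand-in for a table. The data, the function names, and the dict used as an "index" are all assumptions for illustration, not a model of any particular system.

```python
# Hypothetical table: 1000 rows with an id and an amount.
rows = [{"id": i, "amount": i * 10} for i in range(1000)]


def total_amount(rows):
    """Aggregation: has to touch every row anyway, so a sequential
    scan (which is what mapreduce does) wastes nothing."""
    return sum(row["amount"] for row in rows)


def find_by_scan(rows, wanted_id):
    """Point query done mapreduce-style: walk the rows until the
    single matching one turns up."""
    for row in rows:
        if row["id"] == wanted_id:
            return row
    return None


# A dict keyed by id stands in for a database index.
index = {row["id"]: row for row in rows}


def find_by_index(index, wanted_id):
    """Point query done with an index: jump straight to the row."""
    return index.get(wanted_id)
```

Both lookups return the same row, but the scan's cost grows with the size of the table while the indexed lookup's does not. For the sum, there is no shortcut to skip, which is why the scan-everything model fits aggregations well.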


