A quick summary follows the quick editorial and after that a quick new thought: ...

jhugg · on Oct 21, 2010

Not all of the distributed NoSQL systems give up consistency. Notably, HBase and, in certain scenarios, MongoDB both offer consistent atomic reads and writes of one thing, be it a row, supercolumn, document or whatever.

You make a good point that NoSQL is about so much more than the CAP theorem. That doesn't mean there aren't a ton of people (some very smart) out there citing the CAP theorem as proof that you have to give up consistency to be "web-scale".

dasht · on Oct 21, 2010

Thank you. I wasn't aware of that.

Yeah, there are three things in play (I would say):

1. The language design of SQL is often icky. A different logical language could be nice.

2. The logical / physical separation is sometimes right on and sometimes... not so much. At the very least, there doesn't seem to be One True Logical Model, hence exposing a physical model with transactions seems justified.

3. Often, people are quick to give up consistency and while that is not always the wrong thing to do, it's done more than it ought to be. But... a lot of people hacking up big sites and similar these days... their standards are low and the "customer's" tolerance for resulting bugs and glitches is startlingly high (for now).

All three of those get lumped under the "NoSQL" heading and it is helpful (I think) to tease them apart.

wmf · on Oct 21, 2010

NoSQL may be bad engineering in many of its uses ... but it's easier. A lot easier.

This sounds like a serious false economy. That's the story of our industry, though.

An ACID DB that gave a more simple-minded logical model than SQL ... including, sure, relaxing ACID constraints where that was really desirable ... could go a long way toward fixing the confusion around NoSQL.

As Stonebraker says, you can easily layer a key-value API on SQL. Or there's Berkeley DB; maybe they should rebrand it as Berkeley NoSQL.
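
To make the layering concrete, here is a minimal sketch of a key-value API on top of a SQL table. It assumes SQLite via Python's standard sqlite3 module; the class name SqlKeyValueStore, the table name kv and the method names are made up for illustration and are not taken from Stonebraker or from any particular product.

```python
import sqlite3


class SqlKeyValueStore:
    """Hypothetical key-value wrapper over a single SQL table (illustration only)."""

    def __init__(self, path=":memory:"):
        # Assumption: SQLite as the SQL engine; any ACID database would do.
        self.conn = sqlite3.connect(path)
        with self.conn:
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS kv ("
                "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
            )

    def put(self, key, value):
        # "with self.conn" commits on success and rolls back on an exception,
        # so each put is a single atomic transaction.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def delete(self, key):
        with self.conn:
            self.conn.execute("DELETE FROM kv WHERE key = ?", (key,))


if __name__ == "__main__":
    store = SqlKeyValueStore()
    store.put("user:1", "alice")
    print(store.get("user:1"))
```

The caller only sees put/get/delete, while the SQL engine underneath still provides transactions. What this sketch does not give you is the replication or sharding that people usually want from the distributed NoSQL systems.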

ithkuil · on Oct 22, 2010

Berkeley DB is NoSQL under some, let's say literal, definitions of NoSQL.

neo4j is currently also such a db: a non-SQL db, a NoSQL.

Normally, though, people expect something more from this NoSQL "movement".

People want "scaling", and elasticity.

We live in a world where today you have 2 users, tomorrow you have 2 million users, and the day after you have only 1000. We also live in a world where people expect everything to work all the time, and are pretty pissed off if things don't respond in seconds (watch Louis C.K. http://www.youtube.com/watch?v=8r1CZTLk-Gk).

It's no surprise that all this hype about NoSQL came out when a number of db implementations were developed that handled replication, sharding, dynamic resizing (adding and removing nodes), etc.

Now let's put aside for a moment the issue with the word "SQL" per se. Let's focus on the "partition tolerance" feature.

I was always frustrated by the fact that no matter how great my product was, how perfect the implementation, how great my db, if my machine, rack, datacenter or section of a datacenter went down, switches broke, network connectivity was lost, etc., the users of my application could not use it. For them it's down.

This kind of NoSQL, the one that handles partition tolerance, gives you the hope that eventually you will be able to make great software that withstands this kind of event.

I'm not sure whether with these tools (Cassandra, Riak, etc.) I would be able to write an application that actually works better, even in the other cases. Stonebraker is right when he says that there are other, more probable causes of errors, and that the compromises imposed by partition tolerance will probably make your software development so complex that you will make a lot of other errors and make the production system unusable.

But at least there's hope that by using these tools you can make things that survive severe conditions. At least this is why I think people get so excited about all this.

ieure · on Oct 21, 2010

Nice summary. http://bit.ly/9Pq3aE