"Schema-less" has the potential (if you use it properly) advantage of allowing g...

fauigerzigerk · on Nov 6, 2011

"Schemaless" most of the time means "code based schema". Dealing with multiple schema versions at the same time is always possible, relational or not, but it causes significant bloat and complexity. When I hear gradual migration I think code decay, but I can see why it could be useful sometimes.

In my view, schemaless models are only desirable if the schema is not known until runtime, e.g. user specified fields or message structures, external file formats that you don't control but might need to query, etc.

einhverfr · on Nov 6, 2011

Well, also there is the issue of highly unstructured data. In LedgerSMB, we put it in PostgreSQL along with highly structured data, and just use key-value modelling. These include things like configuration settings for the database in question and the specifics about what a menu item does. I might migrate some of this to hstore in the future (particularly the menus).

There are many shortcomings of this approach but when dealing with highly unstructured data (or basically where the inherent structure is that of key/value pairs) it strikes me as the correct approach, and not different really from using NoSQL, XML, or any other non-relational store.

zzzeek · on Nov 6, 2011

Right, but was that MySQL? Schema migrations are not a problem on quality systems like Oracle and PostgreSQL. Altering tables and such doesn't stop the database from running at all.

It's always MySQL's fault in these things.

goldmab · on Nov 6, 2011

Fair enough, but you can also have a schemaless store by using JSON fields in PostgreSQL or MySQL.

angelbob · on Nov 6, 2011

Not indexably. But you can do a hideous many-tables-per-real-table thing where each field gets a tall thin table in PostgreSQL or MySQL, do a lot of joins to get your data, and index the fields in that.

It's not as awful as it sounds, performance-wise. It is as awful as it sounds in terms of maintainability, of course.
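
For anyone who hasn't seen the pattern, here is a rough sketch of what I mean. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL or MySQL, and every name in it (`entity`, `field_name`, `field_city`, the helper functions) is made up for illustration, not taken from any real system.

```python
import sqlite3

# Hypothetical set of "fields"; each one gets its own tall, thin table.
FIELDS = ("name", "city")


def build_field_tables(conn):
    """Create the anchor table plus one indexed table per field."""
    conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY)")
    for field in FIELDS:
        conn.execute(
            f"CREATE TABLE field_{field} ("
            " entity_id INTEGER PRIMARY KEY REFERENCES entity(id),"
            " value TEXT)"
        )
        # The index on value is what makes the field searchable.
        conn.execute(
            f"CREATE INDEX idx_field_{field}_value ON field_{field}(value)"
        )


def insert_entity(conn, entity_id, **fields):
    """Insert one logical row; fields that are absent get no row at all."""
    conn.execute("INSERT INTO entity (id) VALUES (?)", (entity_id,))
    for field, value in fields.items():
        if field not in FIELDS:  # table names are interpolated, so whitelist
            raise ValueError(f"unknown field: {field}")
        conn.execute(
            f"INSERT INTO field_{field} (entity_id, value) VALUES (?, ?)",
            (entity_id, value),
        )


def find_by_city(conn, city):
    """Reassemble logical rows with joins, filtering on an indexed field."""
    return conn.execute(
        """
        SELECT e.id, n.value, c.value
        FROM field_city AS c
        JOIN entity AS e ON e.id = c.entity_id
        LEFT JOIN field_name AS n ON n.entity_id = e.id
        WHERE c.value = ?
        ORDER BY e.id
        """,
        (city,),
    ).fetchall()
```

Every extra field means another table and another join in every query that wants the full row, which is where the maintainability pain comes from.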

joevandyk · on Nov 6, 2011

You can index hstore fields in PostgreSQL.

deoxxa · on Nov 6, 2011

That's not an unfair comparison at all - indexing the data in a JSON blob is entirely possible and practical.
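
As one illustration only, here is a sketch using Python's built-in sqlite3, assuming the SQLite build includes the JSON functions and supports indexes on expressions. It is not how PostgreSQL or MySQL do it, and the table, index, and function names are all hypothetical.

```python
import json
import sqlite3


def make_docs(conn):
    """Store each document as a JSON text blob and index one field inside it."""
    conn.execute(
        "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    # Expression index: the key is computed from the blob when rows are written.
    conn.execute(
        "CREATE INDEX idx_docs_email ON docs (json_extract(body, '$.email'))"
    )


def add_doc(conn, doc):
    """Serialize a dict to JSON and store it."""
    conn.execute("INSERT INTO docs (body) VALUES (?)", (json.dumps(doc),))


def find_by_email(conn, email):
    """Look documents up by a field inside the blob.

    The WHERE clause repeats the indexed expression exactly, which is what
    lets the planner consider idx_docs_email.
    """
    rows = conn.execute(
        "SELECT body FROM docs"
        " WHERE json_extract(body, '$.email') = ? ORDER BY id",
        (email,),
    ).fetchall()
    return [json.loads(body) for (body,) in rows]


def query_plan(conn, email):
    """Return the planner's description of the lookup, for inspection."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT body FROM docs"
        " WHERE json_extract(body, '$.email') = ?",
        (email,),
    ).fetchall()
    return [row[-1] for row in rows]
```

Documents that lack the field simply index as NULL, so the blob stays schemaless while the one field you care about is still searchable.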

einhverfr · on Nov 6, 2011

What you want to index regarding a large text file and what indexes you can create and use may be different.

christkv · on Nov 6, 2011

Even more common is when you have a mature application with a lot of users and you need to add new fields to, for example, the user table, and you can't, because ALTER TABLE across a sharded DB setup will take days or weeks. So you end up creating a table that's a hashtable:

key, value

and then proceed to pay the cost of joins against it. Most of my excitement around NoSQL comes from hard-earned pain, not from "oh new shiny thing, I got to use it".

mechanical_fish · on Nov 6, 2011

I'll take well-understood pain that I can patiently work around, one time, over the course of days or weeks, if the alternative is random bugs that bite you in the night for years at a time.

Joins are no fun, yes, but as you gritted your teeth and implemented those cute little table-based key-value stores, did you find yourself mentally calculating the time required to restore the whole system from backup while muttering tiny prayers? Probably not. Did your code wake up the ops team an average of once per month for several years? Did you lose data? Did you have to put up an apologetic blog post? Did anyone have to get on the phone and rescue customer accounts, one at a time, with profuse apologies and gifts? (Now that is a non-scalable process...)

But at least this argument about maintenance is a real argument. The one about wanting to save time during initial development by skipping the declaration of schemas reads like the punchline of a Dilbert cartoon that you'd find taped to the wall in the devops lunchroom.

christkv · on Nov 6, 2011

@mechanical_fish Yes, and it was a MySQL installation. Weird things happen with all systems once you push them up to the edge of performance, both of the hardware and of the interconnections between servers.

Slow interconnect between servers caused me headaches in the past with MySQL replication. Shared switches did the same. Problems with locks under high contention did the same. Problems with the client libraries did the same. In fact, all storage systems have similar problems and pain. Some are just more battle-tested than others.