"Schema-less" has the potential (if you use it properly) advantage of allowing gradual migration.
As long as your code can handle all versions of objects in current use, you can deploy new code, then either migrate objects as they're updated/rewritten, and/or slowly migrate objects in the background.
For certain types of schema changes in large enough data stores, this can be a killer feature. I remember one RDBMS setup I had to deal with where we were "stuck" having to do a lot of suboptimal schema changes, because the changes we actually wanted to make caused the system (based on tests in our dev environment) to slow to a crawl, to the point where it was unusable for 8+ hours, and we just couldn't afford that kind of downtime. We spent a lot of engineering time working our way around something that'd simply be a non-issue in a schema-less system.
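To make the gradual migration idea above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the store is a plain dict of JSON strings standing in for a document store, the `version` field and the name-splitting change are hypothetical, and `load`, `save` and `migrate_in_background` are made-up helper names rather than any real database's API.

```python
import json

CURRENT_VERSION = 2

def upgrade_v1_to_v2(doc):
    # Hypothetical schema change: v1 had a single "name", v2 splits it in two.
    first, _, last = doc["name"].partition(" ")
    new_doc = {k: v for k, v in doc.items() if k != "name"}
    new_doc.update({"first_name": first, "last_name": last, "version": 2})
    return new_doc

# Maps an old version number to the function that upgrades it by one step.
UPGRADERS = {1: upgrade_v1_to_v2}

def load(store, key):
    """Read a document and upgrade it in memory if it is an old version."""
    doc = json.loads(store[key])
    # Documents written before versioning existed are treated as version 1.
    while doc.get("version", 1) < CURRENT_VERSION:
        doc = UPGRADERS[doc.get("version", 1)](doc)
    return doc

def save(store, key, doc):
    """Write a document back; anything that went through load() is current."""
    store[key] = json.dumps(doc)

def migrate_in_background(store, batch_size=100):
    """Rewrite up to batch_size old documents; return how many were rewritten."""
    migrated = 0
    for key in list(store):
        if migrated >= batch_size:
            break
        if json.loads(store[key]).get("version", 1) < CURRENT_VERSION:
            save(store, key, load(store, key))
            migrated += 1
    return migrated
```

New code only ever sees current-version documents from `load`, old documents get rewritten whenever they are saved, and the background job can be run in small batches at whatever pace the system tolerates. The cost is that every upgrader has to stay in the codebase until no document of that version is left.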
"Schemaless" most of the time means "code based schema". Dealing with multiple schema versions at the same time is always possible, relational or not, but it causes significant bloat and complexity. When I hear gradual migration I think code decay, but I can see why it could be useful sometimes.
In my view, schemaless models are only desirable if the schema is not known until runtime, e.g. user specified fields or message structures, external file formats that you don't control but might need to query, etc.
Well, also there is the issue of highly unstructured data. In LedgerSMB, we put it in PostgreSQL along with highly structured data, and just use key-value modelling. These include things like configuration settings for the database in question and the specifics about what a menu item does. I might migrate some of this to hstore in the future (particularly the menus).
There are many shortcomings of this approach but when dealing with highly unstructured data (or basically where the inherent structure is that of key/value pairs) it strikes me as the correct approach, and not different really from using NoSQL, XML, or any other non-relational store.
Right, but was that MySQL? Schema migrations are not a problem on quality systems like Oracle and PostgreSQL. Altering tables and such doesn't stop the database from running at all.
Not indexably. But you can do a hideous many-tables-per-real-table thing where each field gets a tall thin table in Postgres or MySQL, do a lot of joins to get your data, and index the fields in that.
It's not as awful as it sounds, performance-wise. It is as awful as it sounds in terms of maintainability, of course.
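For anyone who hasn't seen the many-tables-per-real-table layout, a rough sketch follows. It uses Python's built-in `sqlite3` with an in-memory database purely so it runs anywhere; the comment above is about Postgres or MySQL, and the table names (`user_email`, `user_city`) and the `find_users_by_city` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    -- One tall, thin table per field, each with its own index on the value.
    CREATE TABLE user_email (
        user_id INTEGER PRIMARY KEY REFERENCES users(id),
        value   TEXT
    );
    CREATE TABLE user_city (
        user_id INTEGER PRIMARY KEY REFERENCES users(id),
        value   TEXT
    );
    CREATE INDEX user_email_value ON user_email(value);
    CREATE INDEX user_city_value  ON user_city(value);
""")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO user_email (user_id, value) VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com")],
)
conn.execute("INSERT INTO user_city (user_id, value) VALUES (?, ?)", (1, "Oslo"))

def find_users_by_city(city):
    """Look up by one field's table, then join the other fields back in."""
    return conn.execute(
        """
        SELECT u.id, e.value AS email, c.value AS city
        FROM user_city c
        JOIN users u ON u.id = c.user_id
        LEFT JOIN user_email e ON e.user_id = u.id
        WHERE c.value = ?
        """,
        (city,),
    ).fetchall()
```

Adding a field means creating a new small table instead of altering a big one, and each field stays indexable. The maintainability cost shows up in the query: every extra field you want back is another join.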
Even more common is when you have a mature application with a lot of users and you need to add new fields to, e.g., the user table, and you can't, because ALTER TABLE across a sharded DB setup will take days or weeks. So you end up creating a table that's a hashtable
key, value
and then proceed to pay the cost of joins against it. Most of my excitement around NoSQL comes from hard-earned pain, not from "oh new shiny thing, I got to use it".
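The key/value side table described above looks roughly like this. Again this is only a sketch using Python's built-in `sqlite3` in memory, not a sharded production setup; `user_attrs`, `attr_key`, `attr_value`, `set_attr` and `user_with_attr` are hypothetical names.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The "hashtable" table: new fields become rows instead of columns,
    -- so the big users table never has to be altered.
    CREATE TABLE user_attrs (
        user_id    INTEGER NOT NULL REFERENCES users(id),
        attr_key   TEXT NOT NULL,
        attr_value TEXT,
        PRIMARY KEY (user_id, attr_key)
    );
""")
db.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
db.execute("INSERT INTO users (id, name) VALUES (2, 'bob')")

def set_attr(user_id, key, value):
    """Insert or overwrite one extra attribute for a user."""
    db.execute(
        "INSERT OR REPLACE INTO user_attrs (user_id, attr_key, attr_value) "
        "VALUES (?, ?, ?)",
        (user_id, key, value),
    )

def user_with_attr(user_id, key):
    """Fetch a user plus one extra attribute; this join is the recurring cost."""
    return db.execute(
        """
        SELECT u.name, a.attr_value
        FROM users u
        LEFT JOIN user_attrs a ON a.user_id = u.id AND a.attr_key = ?
        WHERE u.id = ?
        """,
        (key, user_id),
    ).fetchone()
```

Users that never got the new attribute simply come back with `None` for it, which is the same "handle every version in code" burden as in a schema-less store, just paid through a join.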
I'll take well-understood pain that I can patiently work around, one time, over the course of days or weeks, if the alternative is random bugs that bite you in the night for years at a time.
Joins are no fun, yes, but as you gritted your teeth and implemented those cute little table-based key-value stores, did you find yourself mentally calculating the time required to restore the whole system from backup while muttering tiny prayers? Probably not. Did your code wake up the ops team an average of once per month for several years? Did you lose data? Did you have to put up an apologetic blog post? Did anyone have to get on the phone and rescue customer accounts, one at a time, with profuse apologies and gifts? (Now that is a non-scalable process...)
But at least this argument about maintenance is a real argument. The one about wanting to save time during initial development by skipping the declaration of schemas reads like the punchline of a Dilbert cartoon that you'd find taped to the wall in the devops lunchroom.
@mechanical_fish yes, and it was a MySQL installation. Weird things happen with all systems once you push them up to the edge of performance, of both the hardware and the interconnections between servers.
Slow interconnect between servers caused me headaches in the past with MySQL replication. Shared switches did the same. Problems with locks under high contention did the same. Problems with the client libraries did the same. In fact, all storage systems have similar problems and pain. Some are just more battle-tested than others.