
To quote from the report: "Moreover, the snapshot read concern did not guarantee snapshot unless paired with write concern majority—even for read-only transactions."

Of course it doesn't work when you don't pair it with majority read/write concern. You can't expect to get a snapshot of data that hasn't yet been acknowledged by a majority of the cluster.

As to the quote you probably are referring to:

"Jepsen evaluated MongoDB version 4.2.6, and found that even at the strongest levels of read and write concern, it failed to preserve snapshot isolation."

I did not find any proof of this in the rest of the report. It seems this is mostly a complaint about what happens when you mix different read and write concerns.

I would also suggest thinking a little about the concept of a snapshot in the context of a distributed system. With MongoDB's architecture, it is not possible to have the same kind of snapshot that you would get with a single-node application. MongoDB is a distributed system where you will get different results depending on which node you ask.

The only way you could get close to having a global snapshot is if all nodes agreed on a single truth (for example, a single log file, a blockchain, etc.), which would preclude reads and writes with a concern level lower than majority.



Did you see the part about "Operations in a transaction use the transaction-level read concern. That is, any read concern set at the collection and database level is ignored inside the transaction."?

"Transactions without an explicit read concern downgrade any requested read concern at the database or collection level to a default level of local, which offers “no guarantee that the data has been written to a majority of replicas (i.e. may be rolled back).”"

The big problem is that, even if somebody correctly sets the read and write concerns to something sensible, the moment they use a transaction these guarantees fly out the window, unless they read the docs carefully enough to realise they have to set the read and write concern for the transaction too. The defaults are very un-intuitive; I can't imagine that needing snapshot isolation in general but being fine with arbitrary data loss in transactions is a common case, compared to wanting to avoid data loss both generally and in transactions.
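
To make the quoted rule concrete, here is a toy model of the resolution order in plain Python. It is only a sketch of the behaviour described in the two quotes above, not MongoDB driver code: `effective_read_concern` and its parameters are hypothetical names, and the rule that the most specific setting wins outside a transaction is a simplifying assumption.

```python
from typing import Optional

# Default level the report says a transaction falls back to
# when no read concern is set on the transaction itself.
TRANSACTION_DEFAULT = "local"


def effective_read_concern(
    in_transaction: bool,
    transaction_level: Optional[str] = None,
    collection_level: Optional[str] = None,
    database_level: Optional[str] = None,
) -> str:
    """Toy model (hypothetical helper): which read concern applies to a read."""
    if in_transaction:
        # Inside a transaction, collection- and database-level settings
        # are ignored; only the transaction-level setting counts.
        return transaction_level or TRANSACTION_DEFAULT
    # Outside a transaction: assume the most specific setting wins.
    return collection_level or database_level or "local"


# Collection configured with "majority", but the transaction sets nothing:
print(effective_read_concern(True, collection_level="majority"))
# Same collection, with the read concern set on the transaction itself:
print(effective_read_concern(True, transaction_level="snapshot",
                             collection_level="majority"))
```

In this model the first call yields `local` and the second `snapshot`, which is the trap described above: the collection-level setting silently stops mattering once a transaction starts.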


It is one thing to complain about unclear documentation and unintuitive guarantees, and another to say that it just doesn't work.

Yes it works. Yes, you have to read the documentation very carefully.


Not saying you're wrong. As an anecdotal data point - we read the docs (carefully) and spoke to MongoDB quite a bit when implementing transactions, including their highest paid levels of support, and still ran into this issue:

> transactions running with the strongest isolation levels can exhibit G1c: cyclic information flow.

As well as the Node.js API issue (I just checked randomly and their Python API has the same bug lol) listed above.


It is not different. For a product like mongodb, both the durability guarantees and the documentation explaining them are an integral part of the user experience. If I'm starting a project, I'm making decisions for a junior developer whom I'll hire in 2 years. I care what code that junior developer will be most nudged to write.

If the Stripe API had documentation that was needlessly unclear in a way which led people to lose a significant amount of money, that would be a bug.


> I did not find any proof of this in the rest of the report.

May I suggest sections 3.4, 3.5, 3.6, 3.7, 4.0, and 4.1?


Quoting half the report is bad for the discussion as it makes it impossible for the reader to follow.


Chief, it does not have to be this hard. 3.4 clearly states:

This anomaly occurred even with read concern snapshot and write concern majority

3.5: In this case, a test running with read concern snapshot and write concern majority executed a trio of transactions with the following dependency graph

3.6: Worse yet, transactions running with the strongest isolation levels can exhibit G1c: cyclic information flow.

3.7: It’s even possible for a single transaction to observe its own future effects. In this test run, four transactions, all executed at read concern snapshot and write concern majority, append 1, 2, 3, and 4 to key 586—but the transaction which wrote 1 observed [1 2 3 4] before it appended 1.

Like... if you had read any of these sections--or even their very first sentences--you wouldn't be in this position. They're also summarized both in the abstract and discussion sections, in case you skipped the results.

4.0: Finally, even with the strongest levels of read and write concern for both single-document and transactional operations, we observed cases of G-single (read skew), G1c (cyclic information flow), duplicated writes, and a sort of retrocausal internal consistency anomaly: within a single transaction, reads could observe that transaction’s own writes from the future. MongoDB appears to allow transactions to both observe and not observe prior transactions, and to observe one another’s writes. A single write could be applied multiple times, suggesting an error in MongoDB’s automatic retry mechanism. All of these behaviors are incompatible with MongoDB’s claims of snapshot isolation.

It's OK to stop digging now!
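
For anyone who wants to see what the 3.7 anomaly quoted above looks like as a check, here is a minimal sketch in Python. It is not Jepsen's checker: the operation tuples and `reads_own_future_writes` are hypothetical, the read-then-append ordering is illustrative, and it assumes each value is appended to a key at most once.

```python
from typing import List, Tuple

# An operation is ("append", key, value) or ("read", key, observed_list).
Op = Tuple[str, int, object]


def reads_own_future_writes(txn: List[Op]) -> bool:
    """Return True if a read sees a value this transaction only appends later."""
    for i, (kind, key, observed) in enumerate(txn):
        if kind != "read":
            continue
        # Values this same transaction appends to the key AFTER the read.
        later_appends = {
            value
            for later_kind, later_key, value in txn[i + 1:]
            if later_kind == "append" and later_key == key
        }
        if later_appends & set(observed):
            return True
    return False


# Shape of the quoted case: the read shows 1 before the transaction appends 1.
anomalous = [("read", 586, [1, 2, 3, 4]), ("append", 586, 1)]
# A well-behaved transaction: append first, then observe its own write.
normal = [("append", 586, 1), ("read", 586, [1])]

print(reads_own_future_writes(anomalous), reads_own_future_writes(normal))
```

Under any isolation level that orders a transaction's own operations, this function should never return True for a committed transaction.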


May I suggest an alternative perspective on the matter?

Compared to a product like Oracle, transactions on MongoDB are very new, very niche functionality. Even MongoDB consultants openly suggest not using them.

MongoDB is really meant to store and retrieve documents. That's where the majority read/write concern guarantees come from.

As long as you are storing and retrieving documents, you are pretty safe.

Your article presents the situation as if MongoDB did not work correctly at all. That is simply not true; the most you can say is that a single (niche) feature doesn't work.

Have you ever tried distributed transactions with relational databases? Everybody knows these exist but nobody of sound mind would ever architect their application to rely on them.

Any person with a bit of experience will understand that things don't come free and some things are just too good to be true. MongoDB marketing may be a bit trigger happy with their advertisements, but that does not mean the product is unusable; they probably just promised a bit too much.


> Have you ever tried distributed transactions with relational databases?

I am delighted to say that yes: checking safety properties of distributed systems, including those of relational databases, is literally my job. See https://jepsen.io/analyses for a comprehensive list of prior work, or http://jepsen.io/analyses/tidb-2.1.7, http://jepsen.io/analyses/yugabyte-db-1.1.9, http://jepsen.io/analyses/yugabyte-db-1.3.1, or http://jepsen.io/analyses/voltdb-6-3 for recent examples of Jepsen analyses on relational databases.


This comment will rightfully be downvoted, but I'm going to break HN decorum for once in my long posting history here and simply say:

Holy shit, buddy. Stop.


The world does not revolve around HN votes. If your first urge is whether the post gets downvoted or not you might want to rethink your life a little bit.

So don't worry about me.


I'm not "worried" nor experiencing an "urge." Please skip the concern trolling.

What I do have an interest in is HN's accepted decorum, which I admittedly stepped outside of when I implored you to stop digging yourself such a hole.

HN is far from perfect but there is a culture of respectful discourse here, which is part of the reason for its value IMO.


Please stop. We don't want flamewars here.


May I suggest the tiniest bit of consideration (such as reading the report) before jumping to conclusions and low-key offending the author? You should be embarrassed.


This comment looks a bit comical when compared with the one you started this whole thread with. You're an engineer, why are you siding with marketing over measured technical facts? Do you think denial will make your infrastructure any safer? Don't make excuses for MongoDB, just acknowledge the article as an appropriately well weighted response to their marketing claims and move on.


You may want to delete this comment too.


Jesus christ @-@


> May I suggest an alternative perspective on the matter?

Can't reply to that since it's too nested, so I'll reply here. I warmly recommend getting off the tree you climbed and actually reading the article, because if you do, you will see you are not disagreeing on that part.

The article is a mostly technical analysis of the transaction isolation levels and where they hold. The main criticism is how MongoDB advertises itself. If they didn't claim the database is "fully ACID" then the article would have just been a technical analysis :]



