One idea my brother and I messed about with was "Merkle DAGs". The place we worked at had what amounted to a really fancy workflow/flowchart designer, and the question was always how to make it collaborative (because that would sell like hotcakes on a slide deck, irrespective of whether people used it). Another concern was versioning, and Git is a Merkle DAG; great problems think alike.
We never finished it because of some truly difficult questions, and decided that failure was the most reasonable outcome. There were some truly awesome things, though. For example, if two users independently connected the same two flowchart steps with the same two lines, those two changes would be understood as identical by the system (by nature/emergence). It also gave very granular snapshot consistency, which is incredibly simple to reason about.
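To make the "identical by nature/emergence" bit concrete, here's a minimal sketch in Python (the edge encoding is invented for illustration, not what we actually built): when an edit's ID is a hash of its content, two replicas that make the same edit mint the same ID, so there's nothing to merge.

```python
import hashlib
import json

def edge_id(src_step: str, dst_step: str) -> str:
    """The edge's ID is a hash of what the edge *is*.

    Two users who independently draw the same connection produce
    the same bytes, hence the same ID: the "conflict" never exists.
    """
    payload = json.dumps({"src": src_step, "dst": dst_step}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# Two replicas, no coordination, one identical change:
assert edge_id("validate", "approve") == edge_id("validate", "approve")
```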
The other cool thing was that CAS (content-addressable storage, again a Git concept) is absurdly cache friendly, and all sorts of fancy stuff can be done there.
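A rough sketch of why CAS caches so well (hypothetical API, nothing Git-specific): blobs are keyed by their own hash and never change, so any pure function of a blob can be memoized forever with zero invalidation logic.

```python
import hashlib
from typing import Any, Callable

class ContentAddressableStore:
    """Blobs keyed by their own SHA-256; entries are immutable."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}
        self._memo: dict[tuple, Any] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data  # same key always means same bytes
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def derive(self, fn: Callable[[bytes], Any], key: str) -> Any:
        """Memoize any pure function of a blob. Immutability means the
        cached result stays valid forever; no invalidation is needed."""
        memo_key = (fn.__name__, key)
        if memo_key not in self._memo:
            self._memo[memo_key] = fn(self._blobs[key])
        return self._memo[memo_key]
```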
We ultimately abandoned it because we had messed up somewhere and wound up with tons of edge cases (the hope had been that we wouldn't need them). I still think that, absent our mistakes, it could work and would make for really elegant code.
RON has progressed over these three years. RON docs now sit in a RON-based revision control system coupled with a RON-based wiki: http://doc.replicated.cc/^Wiki/ron.sm
This is brilliant. One of the best things I've read in a while.
I find it striking that the author's 7 rules (at about 3/4 of the scrollbar on desktop) read much like a manifesto for functional programming. In particular:
> The operations are the data.
That strikes me as a sword through the Gordian Knot of distributed state. Data isn't some mutable black box you poke mutations into and read back out of. Data is the set of operations used to construct it. This feels to me like the principle underlying FP in general, event sourcing, the reactor pattern, and of course these RDTs. Of course, this implies that a piece of data is the causal history of that data (which in the physical world is strictly true).
Ceci n'est pas un byte[].
The fun part comes in all the different ways to optimize going from the territory (all the operations on a data blob) to the map (the "realized" data blob, or view into the system).
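A toy illustration of territory vs. map, assuming a trivial insert/delete op log (the `Op` shape here is made up for the example, not RON's actual encoding): the readable value is just a fold over the operations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    kind: str       # "insert" or "delete"
    pos: int
    char: str = ""

def materialize(ops: list[Op]) -> str:
    """The map: a realized view computed by folding over the territory."""
    text: list[str] = []
    for op in ops:
        if op.kind == "insert":
            text.insert(op.pos, op.char)
        else:  # delete
            del text[op.pos]
    return "".join(text)

log = [Op("insert", 0, "h"), Op("insert", 1, "i"), Op("delete", 0)]
assert materialize(log) == "i"  # the value carries its causal history
```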
This also has flavors of quantum mechanics to me. Causality is anything you can get away with mutating before a meaningful observation occurs.
Let's say you have a causal tree representing an output sequence, and you are interested in running an arbitrary set of finite state machines over this sequence. What is an efficient approach to rewinding and replaying those state machines in the face of updates coming into the sequence? What if those state machines themselves emit (and might want to retract) operations on the underlying sequence? Does anyone know what I should be googling here?
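The naive approach I can think of is periodic checkpoints: an update at position p only forces replay from the nearest snapshot before p. A sketch under the assumption that the step function is pure and states are immutable values (all names hypothetical); "incremental computation", "self-adjusting computation", and "retroactive data structures" might be useful search terms, though I'd welcome better ones.

```python
from typing import Any, Callable

def run_with_checkpoints(seq: list, step: Callable[[Any, Any], Any],
                         init: Any, interval: int = 64):
    """Run a state machine over seq, snapshotting every `interval` items.

    Assumes `step` is pure and states are immutable values, so a
    snapshot is just a reference, not a deep copy.
    """
    checkpoints = {0: init}  # position in seq -> state at that position
    state = init
    for i, item in enumerate(seq, 1):
        state = step(state, item)
        if i % interval == 0:
            checkpoints[i] = state
    return state, checkpoints

def replay_after_edit(seq: list, step, checkpoints: dict, edit_pos: int):
    """Rewind to the last snapshot at or before edit_pos, replay the rest.

    Checkpoints past edit_pos are stale after the edit and would need
    to be dropped or recomputed.
    """
    base = max(p for p in checkpoints if p <= edit_pos)
    state = checkpoints[base]
    for item in seq[base:]:
        state = step(state, item)
    return state

# Toy FSM: count occurrences of "a".
count_a = lambda state, ch: state + (ch == "a")
seq = list("banana")
final, cps = run_with_checkpoints(seq, count_a, 0, interval=2)
assert final == 3
# Edit seq[3], then replay only from the checkpoint at position 2:
seq[3] = "x"
assert replay_after_edit(seq, count_a, cps, edit_pos=3) == 2
```

This says nothing about the harder half of the question, where the state machines themselves emit or retract operations on the underlying sequence; that feedback loop is exactly where I get stuck too.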