https://madebyevan.com/algos/crdt-tree-based-indexing/ - for when precise order is critical, like paragraphs in a document. This algorithm is almost like storing adjacency information like a linked list, but is more convergent. Very interesting for [my use-case](https://www.notion.so/blog/data-model-behind-notion).
https://madebyevan.com/algos/crdt-mutable-tree-hierarchy/ - for tree-shaped data, like blocks in a Notion page that should have exactly one parent, but allow concurrent re-parenting operations
https://madebyevan.com/algos/log-spaced-snapshots/ - log space snapshots, for choosing what fidelity of historical information to store. For context, many CRDTs for rich text or sequences store unbounded history so that any edit made at any time can be merged into the sequence. For long-lived documents, this could be impractical to sync to all clients or keep in "hot" memory. Instead, we can decide to compact historical data and move it to cold storage, imposing a time boundary on what writes the system can accept on the hot path. The log-spaced snapshots algorithm here could be used to decide what should be kept "hot", and how to tune the cold storage.
From wikipedia:
> Similarly, Joseph Gentle who is a former Google Wave engineer and an author of the Share.JS library wrote, "Unfortunately, implementing OT sucks. There's a million algorithms with different tradeoffs, mostly trapped in academic papers. […] Wave took 2 years to write and if we rewrote it today, it would take almost as long to write a second time." But later he amends his comment with "I no longer believe that wave would take 2 years to implement now - mostly because of advances in web frameworks and web browsers."
The two most widely used CRDT implementations (combining JSON like general purpose types and rich text editing types) are:
- Automerge https://github.com/automerge/automerge
- Yjs https://github.com/yjs/yjs
Both have JS and Rust implementations, and have bindings to most online rich text editors.
CRDTs are addictive one you get into them.
> It is a Conflict-free Replicated Data Type (CRDT)
What happened to the idea of defining all non-universally-recognised acronyms the first time you use the term? With people making up new terms exponentially faster than ever before, it’s now more important than ever.
They really are a theoretical model of how distributed, convergent, multi-master systems have to work. IE the DT in CRDT could be a whole datastore, not as just an individual document.
(Wish I could remember who on HN alerted me to this. I had read the paper but didn't grok the full implications).
Regardless, thanks very much for linking the paper! Right up my alley.
(We have a central server and the app works offline so the algorithm from the linked article doesn't apply in our case.)
Explanation here: https://news.ycombinator.com/item?id=33764956. Thanks!
When I read about the problem, I imagined using the mediant to calculate “between”. I think that mediant of $value and 1/1 would work for “after” and mediant of $value and 0/1 would work for “before”.
There must be a requirement I’m missing, though!
mediant: https://en.wikipedia.org/wiki/Mediant_(mathematics)
In general some approaches should be better than others depending on how inserts are distributed — I think my approach is optimal when inserts are made at random (I haven’t proved this), but in practice it may be common for a bunch of things to be inserted in sequence for some applications. In those cases, there are more space-optimal approaches.
Disclamer: I'm the author of Dotted LogootSplit.
[Weis_2009] https://hal.inria.fr/inria-00432368
[Nédelec_2013] https://hal.archives-ouvertes.fr/hal-00921633/en
[André_2013] https://hal.archives-ouvertes.fr/hal-01246212
[Elvinger_2021] https://hal.univ-lorraine.fr/tel-03284806
[0] https://stackoverflow.com/questions/40718900/jiras-lexorank-...
Basically a tree of fractions where you take two rational points on a number line, a/b and c/d, then the next point in the tree is (a+b) / (c+d). Turns out that every single point you create this way has a unique position and never duplicate each other, and it forms a tree like structure.
https://en.wikipedia.org/wiki/Ford_circle
not sure if this would be useful, but basically it could be a fractional index that has a built in tree structure, since it basically means any fraction is a leaf on a Stern-Brocot tree.
[1] https://bartoszsypytkowski.com/operation-based-crdts-arrays-...
We should end the CRDT vs OT debate once and for all.
Edit: let's do that in this case. I've changed the URL from https://madebyevan.com/algos/ based on https://news.ycombinator.com/item?id=33764875.