However, when I was experimenting with CRDTs a while back, it seemed to me that the other big issue is where multiple transactions from different users combine to create an invalid state. Most CRDT toolkits aim for a combination of a JSON-like type structure plus specific structures for rich text: the JSON structures cover general-purpose data, and the rich-text ones exist because that is currently the most common use case. These sorts of general-purpose CRDTs don't have a way to handle, say, the maximum length of an array/string, or minimum and maximum values for a number. They leave this to the application using the toolkit.
For the Yjs + ProseMirror system, the CRDTs are effectively resolved outside of ProseMirror. That's useful, as they can be combined/resolved on the server without a ProseMirror instance running. However, there is a strong possibility that the resulting structure is no longer valid for your ProseMirror schema. This is currently handled by ProseMirror throwing away the invalid parts of the document when it loads.
What I think is needed is a schema system, either as a layer on top of these toolkits or as part of them, that provides rules and conflict resolution, so that there is a way to specify the maximum length of an array/string, or the maximum value of a number. Effectively, generic CRDTs that understand developer-supplied rules and bounds.
The "maximum length" is an interesting one, as depending on the order of transactions you could end up with a different result.
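To make the ordering problem concrete, here is a minimal sketch in plain Python. It does not use Yjs or any real CRDT toolkit; `MAX_LEN`, `Append`, and both apply functions are hypothetical names invented for illustration. It assumes two users concurrently append to a list that already holds two of its three allowed items.

```python
from dataclasses import dataclass

MAX_LEN = 3  # hypothetical developer-supplied bound


@dataclass(frozen=True)
class Append:
    replica: str  # who produced the operation
    clock: int    # Lamport-style counter (assumed to be available)
    value: str


def apply_in_arrival_order(ops):
    """Naive rule: drop any append that would exceed the bound."""
    out = []
    for op in ops:
        if len(out) < MAX_LEN:
            out.append(op.value)
    return out


def apply_deterministically(ops):
    """Order all ops by (clock, replica) first, then keep the first MAX_LEN."""
    ordered = sorted(set(ops), key=lambda op: (op.clock, op.replica))
    return [op.value for op in ordered[:MAX_LEN]]


base = [Append("a", 1, "x"), Append("a", 2, "y")]
from_alice = Append("alice", 3, "A")
from_bob = Append("bob", 3, "B")

# Two replicas receive the same concurrent appends in different orders.
replica_1 = apply_in_arrival_order(base + [from_alice, from_bob])
replica_2 = apply_in_arrival_order(base + [from_bob, from_alice])

# With a total order applied before the bound, both replicas agree.
converged_1 = apply_deterministically(base + [from_alice, from_bob])
converged_2 = apply_deterministically(base + [from_bob, from_alice])
```

With the naive rule the two replicas disagree about which item made it in. Sorting by a total order before enforcing the bound makes them converge, at the cost of one user's append silently losing, which is exactly the kind of policy a schema layer would need to let the developer choose.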
I haven't had documents corrupted by Yjs allowing changes that the schema can't parse, though. Has this happened to you?
And about the schema layer on top of Yjs: you could possibly inspect every update and apply some validation rules. Aren't all operations just inserts, updates or deletes of nodes? You can at least roll back to the previous version as you flush the updates to the doc in the db. Not ideal, though.
Say, for example, you have a <figure> node that can contain a single optional <caption>. If two users concurrently add a caption, then merge their changes, the Yjs document will contain both. It has no concept of what the valid structure is. When this is loaded into ProseMirror, the second caption will be dropped.
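A rough sketch of the figure/caption case, in plain Python rather than the actual Yjs or ProseMirror APIs. `Node`, `generic_merge`, and `enforce_single_caption` are hypothetical names; the merge is a simple set union standing in for a structure-unaware CRDT, and the clean-up step stands in for what happens at load time.

```python
from typing import NamedTuple


class Node(NamedTuple):
    kind: str    # "image" or "caption"
    client: int  # id of the client that created the node
    text: str


def generic_merge(a, b):
    """Union of both sides, ordered by client id so every replica agrees."""
    return sorted(set(a) | set(b), key=lambda n: (n.client, n.kind, n.text))


def enforce_single_caption(children):
    """Schema rule: a figure may hold at most one caption; keep the first."""
    kept, seen_caption = [], False
    for node in children:
        if node.kind == "caption":
            if seen_caption:
                continue  # later captions are silently dropped
            seen_caption = True
        kept.append(node)
    return kept


image = Node("image", 0, "cat.png")
alice = [image, Node("caption", 1, "A cat")]
bob = [image, Node("caption", 2, "My cat")]

merged = generic_merge(alice, bob)       # valid CRDT state, invalid figure
valid = enforce_single_caption(merged)   # what the editor would end up showing
```

The merged state is perfectly consistent across replicas, but it holds two captions. Only the schema-aware step knows that is wrong, and Bob's caption disappears without either user being told.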
With a centralized schema provider, you run a connected node on a trusted server and reject changes that are out of schema or should not be accessed by a user.
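As a sketch of what such a gatekeeper might look like (plain Python, not any real toolkit's API; `Change`, `TrustedServer`, `writers`, and `rules` are all made-up names), the trusted node checks both access and schema before a change is applied:

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    user: str
    path: str
    value: object


@dataclass
class TrustedServer:
    state: dict = field(default_factory=dict)
    writers: dict = field(default_factory=dict)  # path -> set of allowed users
    rules: dict = field(default_factory=dict)    # path -> validation predicate

    def submit(self, change):
        """Apply the change only if the user may write and the value is in schema."""
        if change.user not in self.writers.get(change.path, set()):
            return False  # access check failed
        rule = self.rules.get(change.path)
        if rule is not None and not rule(change.value):
            return False  # schema check failed
        self.state[change.path] = change.value
        return True


server = TrustedServer(
    writers={"title": {"alice"}},
    rules={"title": lambda v: isinstance(v, str) and len(v) <= 10},
)

ok = server.submit(Change("alice", "title", "Hello"))
too_long = server.submit(Change("alice", "title", "x" * 11))
no_access = server.submit(Change("bob", "title", "Hi"))
```

The trade-off is the obvious one: rejected changes never reach other peers, but every write now depends on the trusted server being reachable.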
An owned object is an object where a user (or user group that votes via quorum) that owns the object can veto changes to the object. The changes are temporarily applied until accepted by the owners. I haven't dug deep enough into this BFT implementation to know how our model would map to this model.
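A toy sketch of the owned-object idea, to show the shape of it. This is an assumption-heavy illustration, not the BFT implementation under discussion: `OwnedObject` and its methods are hypothetical, acceptance needs `quorum` owner approvals, and any single owner can veto.

```python
class OwnedObject:
    def __init__(self, owners, quorum, value):
        self.owners = set(owners)
        self.quorum = quorum      # approvals needed to accept a change
        self.accepted = value     # last value accepted by the owners
        self.pending = None       # tentatively applied change, if any
        self.approvals = set()

    @property
    def visible(self):
        """Pending changes are shown immediately, before acceptance."""
        return self.pending if self.pending is not None else self.accepted

    def propose(self, value):
        self.pending = value
        self.approvals = set()

    def approve(self, owner):
        if owner not in self.owners or self.pending is None:
            return  # non-owners have no vote
        self.approvals.add(owner)
        if len(self.approvals) >= self.quorum:
            self.accepted, self.pending = self.pending, None
            self.approvals = set()

    def veto(self, owner):
        if owner in self.owners:
            self.pending = None  # roll back to the last accepted value
            self.approvals = set()


page = OwnedObject({"ann", "ben", "cy"}, quorum=2, value="v1")

page.propose("v2")
seen_while_pending = page.visible   # tentatively applied
page.approve("mallory")             # ignored: not an owner
page.approve("ann")
page.approve("ben")                 # quorum reached, "v2" accepted

page.propose("v3")
page.veto("cy")                     # rolled back to "v2"
```

Whether a veto should itself require a quorum, and what happens to changes that were built on top of a vetoed one, are the parts this sketch deliberately leaves out.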
(specifically, the data representation part)
Here's some source code for an early, work-in-progress Wiki CRDT: https://github.com/hyperhyperspace/wiki-collab/blob/master/s...
Page in the Wiki. Note that data types have a validate method that returns true or false; maybe if false, they're just dropped from the UI? Not sure how the method is used. https://github.com/hyperhyperspace/wiki-collab/blob/master/s...
I haven't found the underlying text or rich text CRDT implementation yet.
Hence projects like https://www.antidotedb.eu (CRDT database in Erlang)
BFT - Byzantine Fault Tolerant [0]
CRDT - Conflict-free Replicated Data Type [1]
[0] https://en.wikipedia.org/wiki/Byzantine_fault
[1] https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...
I don't like unexplained acronyms/initialisms.
1: a usually roofed and walled structure built for permanent use (as for a dwelling)
2: the art or business of assembling materials into a structure
The best kind of blog post!
[0]: https://martin.kleppmann.com/papers/bft-crdt-papoc22.pdf
> Ours (Basic) 27.6MB
> Ours (BFT) 59.5MB
> Automerge (Rust) 232.5MB
I would expect adding the public key tracking to use more memory; I wonder how Automerge is spending so much more memory. Possibly on a bunch of internal caches or memoization that give the order-of-magnitude improvement in speed?
> Ops: 100k
> Ours (Basic) 9.321s
> Ours (BFT) 38.842s
> Automerge (Rust) 0.597s