What do you suggest is the sweet spot for document size and "hotness"? Your cookbook [0] says "We suspect that an Automerge document is best suited to being a unit of collaboration between two people or a small group." Does that mean tens of kilobytes? Hundreds? More? And how much concurrent contention is viable? And is the "atom of contention" the document as a whole, or do you have any plans for merging of sub-parts?
Also, do you have support for juggling multiple transports, either concurrently or back-to-back? In particular, I'm thinking about synchronizing via the cloud when connected, and falling back to peer-to-peer when offline. In that peer-to-peer case, how many peers can I have, and can my peer network behave as a mesh, or must it stick together to some degree?
And finally, it looks like your tutorial [1] doesn't actually exist! You refer to it in a blog post [2], but it's a dead link.
[0] https://automerge.org/docs/cookbook/modeling-data/
As for network transports, you can indeed have multiple at once. I usually run a mix of in-browser transports (MessageChannels) and WebSocket connections. I suspect we'll need to do a little adjusting to account for prioritization once people really start to push on this with things like mDNS vs. relay-server connections, but the design should accommodate that just fine.
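To sketch the idea of juggling several transports at once: the toy fan-out below is NOT the automerge-repo API (which takes a list of network adapters in its config); it just models broadcasting sync messages over whichever transports are currently connected, so a dead cloud relay doesn't block peer-to-peer delivery.

```typescript
// Illustrative sketch only, not automerge-repo's actual interface.
interface Transport {
  name: string;
  connected: boolean;
  send(msg: Uint8Array): void;
}

class SyncFanout {
  private transports: Transport[];

  constructor(transports: Transport[]) {
    this.transports = transports;
  }

  // Send a sync message over every live transport; returns the
  // names of the transports that actually carried it.
  broadcast(msg: Uint8Array): string[] {
    const delivered: string[] = [];
    for (const t of this.transports) {
      if (t.connected) {
        t.send(msg);
        delivered.push(t.name);
      }
    }
    return delivered;
  }
}

// Usage: a cloud relay that is offline plus a live in-browser peer.
const sent: string[] = [];
const relay: Transport = { name: "websocket-relay", connected: false, send: () => sent.push("relay") };
const peer: Transport = { name: "message-channel", connected: true, send: () => sent.push("peer") };
const fanout = new SyncFanout([relay, peer]);
const delivered = fanout.broadcast(new Uint8Array([1, 2, 3]));
console.log(delivered); // only the connected transport's name
```

Prioritization (prefer mDNS/local peers over a relay, say) would slot in as an ordering or filtering step inside `broadcast`.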
As for the docs, my apologies. The "tutorial" was merged into the quickstart as part of extensive documentation upgrades over the last few months. We should update the link in the old blog post accordingly.
Here's a link to save you the effort: https://automerge.org/docs/quickstart/
So if I smoosh everything in my sorta “collaboration context” together into one document, are there any provisions for delta updates on the wire? Your browser-side storage format sounds like it’s compatible with that approach, but what about clients that are far apart version-wise? Are you storing full relay history and also a snapshot?
I see in your format docs [0] that you store change chunks. Are these exposed in the API for atomicity at all? Are there any atomicity guarantees?
And you discuss backends, but I don’t see any pointers to an S3 or Postgres implementation. Is that something you’re keeping closed source for your business model, or am I just missing something?
I haven’t found anything about authorization. Have you done any work there? I quite like the Firebase model, in which you can write simple validation rules that evaluate against the document itself: “only allow users who are listed in path `members` to write to this document” or whatever.
[0] https://automerge.org/automerge-binary-format-spec/#chunk-co...
The backends you see are the ones I use, but the API is a binary-blob key-value store with range queries; supporting other stores should be straightforward.
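To make that concrete, here's a hypothetical in-memory sketch of the shape of backend described: a binary-blob key-value store with range (prefix) queries. The method names and key encoding are illustrative, not the real automerge-repo storage interface, but an S3 or Postgres backend would implement the same handful of operations.

```typescript
// Illustrative sketch of a binary-blob KV store with range queries.
type StorageKey = string[]; // e.g. ["docId", "chunkType", "chunkId"]

class InMemoryStorage {
  private blobs = new Map<string, Uint8Array>();

  private static encode(key: StorageKey): string {
    return key.join("\u0000"); // unambiguous separator for prefix matching
  }

  save(key: StorageKey, data: Uint8Array): void {
    this.blobs.set(InMemoryStorage.encode(key), data);
  }

  load(key: StorageKey): Uint8Array | undefined {
    return this.blobs.get(InMemoryStorage.encode(key));
  }

  // Range query: everything under a key prefix,
  // e.g. all stored chunks belonging to one document.
  loadRange(prefix: StorageKey): Uint8Array[] {
    const p = InMemoryStorage.encode(prefix) + "\u0000";
    const out: Uint8Array[] = [];
    for (const [k, v] of this.blobs) {
      if (k.startsWith(p)) out.push(v);
    }
    return out;
  }

  removeRange(prefix: StorageKey): void {
    const p = InMemoryStorage.encode(prefix) + "\u0000";
    for (const k of [...this.blobs.keys()]) {
      if (k.startsWith(p)) this.blobs.delete(k);
    }
  }
}

// Usage: chunks for two documents; range-load one document's chunks.
const store = new InMemoryStorage();
store.save(["doc1", "snapshot", "a"], new Uint8Array([1]));
store.save(["doc1", "incremental", "b"], new Uint8Array([2]));
store.save(["doc2", "snapshot", "c"], new Uint8Array([3]));
console.log(store.loadRange(["doc1"]).length); // 2
```

In S3 the prefix query maps naturally onto `ListObjectsV2` with a key prefix; in Postgres, onto a `WHERE key LIKE prefix || '%'` over a bytea table.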
Authentication isn’t exactly left as an exercise for the reader, but it is an area of active work. I would say securing access to a URL via whatever mechanism you’re used to should be fine for client-server applications, and peer-to-peer folks seem to mostly have their own ideas.
https://www.youtube.com/watch?v=Mr0a5KyD6BU
Also, check out the local-first unconf that happened right after StrangeLoop 2023:
https://github.com/LoFiUnconf/stlouis2023
Ink & Switch is doing such interesting stuff. Their after party at StrangeLoop was so cool.
Edit: Typo `autosurgeon-repo-rs` to `automerge-repo-rs` and link. https://github.com/automerge/automerge-repo-rs
The API is still a little clunky, with hydrating and reconciling, and it's not as clean as the automerge-repo one, especially with those React examples.
Disclaimer: I’m a co-author and the paper is focused on a different CRDT framework, but the point is that it measures Yjs and Automerge side by side.
The benchmark that concerns me (and that I'm pleased with our progress on!) is that you can edit an entire Ink & Switch long-form essay with Automerge and that the end-to-end keypress-to-paint latency using Codemirror is under 10ms (next frame at 100hz).
While these kinds of benchmarks are incredibly appreciated and absolutely drive us to work on optimizing the problems they uncover, we try to work backwards from experienced problems in real usage as our first priority.
> In our research, we've found that editing is usually serial or asynchronous.
Medium-to-large-size company with a town hall = many people editing a document at the same time. Workshop at a company or a university with a modest size classroom = many people editing a document at the same time. I can't tell you how many times our web-based collaborative code editors would fall over during talks with small audiences we would give back in the days when I led the Scala Center.
Just because one of the benchmarks you have seen (of a multitude of benchmarks) breaks Automerge by stressing it in what we believe is the most stressful scenario possible (multiple concurrent users, which is sort of the point of concurrency/collaboration frameworks) does not make it artificial or worth so flippantly discarding.
> long-form essay with Automerge and that the end-to-end keypress-to-paint latency using Codemirror is under 10ms (next frame at 100hz)
Not at all what we measured.
I'd just like to register here that Yjs is the framework most widely used "in real usage" (your words) and not automerge (for many reasons, not just performance.)
I've seen Matt's work and I think it's quite reasonable to benchmark a concurrent datastructure under concurrent load. Placing systems under high load, even just as a limit study, is how we reveal scalability bottlenecks, optimize them, and avoid pathologies. It's part of good engineering.
If your work can produce more representative workloads from the real world, then they could add to the field's knowledge with new benchmarks.
We use co-editing far more commonly than serial editing.
Coming from a background of XP (extreme programming, pair programming) and a Pivotal Labs style approach to co-thinking, even for executive work we require everyone in a meeting (whether at conference table or remote) to be in the document being shared, and instead of giving feedback, comment or edit in place.
We care a LOT about how laggy this gets, how coherent it remains, and whether it blows up and has to be restarted or, worse, reverted.
If a firm culture "whiteboards" by having one person at the board and everyone else surfing HackerNews, they might not be exercising this. If a firm culture is that whiteboards are a shared activity, everyone gathered around holding their own marker, or even just grabbing it from each other, they might need to exercise CRDTs this way.
Put another way, if you "share" in a conference room with an HDMI cable to a TV, or share in Teams or Zoom by window sharing, you may not be a candidate.
If you "share" by dropping a link to the document in a chat, and see by the cursors and bubbles who is following along, you are a candidate.
. . .
In "Upwelling" you describe an introverted and solitary creative process, before revealing a sufficient quality update to others.
That is certainly a valid use case for unspooling thoughts from one brain, and if those are the wilds you are observing, it makes sense that that's what you'd observe in the wild.
It is not, however, the most productive approach for inventing solutions to logic puzzles with accuracy and correctness in fewer passes, nor for almost any other "group" activity. So maybe your "not what we see in the wild" should be qualified by "but we're actually not looking for live collaboration; we're looking for post-drafting merge".
That said, now the choice of the term "auto-merge" is much clearer, advertising your use case right on the tin, if one thinks about it.
So thanks for the upwelling link, repeated here for convenience:
That being said, I love everything automerge is doing and hope this pace will keep up!
We have built a variety of projects with Automerge, both publicly and privately, including, most recently, Tiny Essay Editor (https://tiny-essay-editor.netlify.app/), a markdown-with-comments editor by Geoffrey Litt.
That said, sponsoring the Automerge team helps us build faster and is always welcome. (Thanks to our current and past sponsors for their support!)
E.g. a personal note-taking app where the user will never have any collaborators, but where they expect the app to work fully offline on multiple devices and reliably sync up when they come online.
Automerge is not VC-backed software. Indeed, for a number of years Automerge was primarily a research project used within the lab. Over the last year, it has matured into production software under the supervision of Alex Good. The improved stability and performance have been a great benefit to both our community and internal users. Our intention is to run the project as sponsored open source for the foreseeable future, and thus far we have done so thanks to the support of our sponsors and through some development grants.
Ink & Switch's research interests drive a lot of Automerge development but funding from sponsors allows us to work on features that are not research-oriented or to accelerate work that we'd like to do but that doesn't have current research applications. If you adopt Automerge for a commercial project, I'd encourage you to join the sponsors of Automerge to ensure its long-term viability.
For applications with more document-structured data, you can now produce inverse patches using Automerge.diff to go between any two points. To implement a reasonable undo in this environment you can record whatever document heads you consider useful undo points and then patch between them.
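A toy model of that undo pattern, with plain objects standing in for documents and array indices standing in for Automerge heads (in real code you would record the document's heads at each undo point and compute the patches between two of them with Automerge.diff; the shape of the history is the same):

```typescript
// Toy model only: indices stand in for Automerge heads, and the
// "inverse patch" is computed by naive field comparison rather
// than by Automerge.diff.
type Doc = Record<string, string>;

class UndoHistory {
  private snapshots: Doc[] = [];

  // Record the current state as an undo point (the "heads").
  mark(doc: Doc): number {
    this.snapshots.push({ ...doc });
    return this.snapshots.length - 1;
  }

  // The changes needed to move the state at `from` back to the
  // state at `to` — i.e. an inverse patch for undo.
  diff(from: number, to: number): Partial<Doc> {
    const a = this.snapshots[from];
    const b = this.snapshots[to];
    const patch: Partial<Doc> = {};
    for (const k of Object.keys({ ...a, ...b })) {
      if (a[k] !== b[k]) patch[k] = b[k];
    }
    return patch;
  }
}

// Usage: mark two undo points, then patch from the newer back to
// the older one.
const history = new UndoHistory();
const v0 = history.mark({ title: "Draft" });
const v1 = history.mark({ title: "Final" });
const undoPatch = history.diff(v1, v0);
console.log(undoPatch); // a patch restoring title to "Draft"
```

The nice property of doing undo this way is that you only record heads you consider meaningful undo points, rather than every keystroke.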
To expand slightly on why the problem remains unsolved: there was a robust discussion at the conference about what the expected behaviour of "undo" ought to be, even in simple cases.
I think it's cool, but I still see CRDTs as very niche.
I also want "local-first", but what I really want is something closer to how traditional desktop apps just open, edit, and save files, not real-time collaboration that is already set up before I add my first collaborator.