
victorNicollet 7y ago

>> can also be synchronous

> Then you're giving up several of the benefits of CQRS, and might as well just not bother with the additional complexity.

Even without asynchronous read/write, there are still benefits worth the (arguably small) additional complexity. For instance, the ability to add new functionality without having to migrate existing data is amazing.

vorpalhex 7y ago

You still have migrations, but they exist on the read store, which means they can be done semi-transparently to clients (pause writes and let them queue up, migrate existing read store to new instance, point read calls and indexers to new store, resume writes).

CQRS adds a lot of complexity. Sometimes it's absolutely worth it, especially if you've already invested in the expertise and tooling to support it. It drastically changes the scaling math, both on the low end (at the very least you need a write store, a read store, a queue, an indexer and an API) and on the high end (you can scale any part of the system as needed in relative isolation). You add in timing issues, rollbacks, asynchronous error handling, and delays between reads/writes.

jmull 7y ago

I don't think you necessarily have to have all this in a useful CQRS system.

E.g., here's a real-world example of a general pattern in which CQRS is pretty simple and useful:

A walking/running app which tracks distance, time, and other relevant information over the course of a workout.

It collects a series of events like "location changed at time X", "user started workout", "user paused workout" etc., over the course of the workout period (actually starting before the user officially starts the workout), and converts these to time, distance and other stats.

This fits CQRS really well since the input is inherently a series of events and the output is information gleaned from processing those events.

You get a full CQRS system by simply fully logging the events.

The advantage is that you can go back and reprocess the sequence if you want to glean new/different information from the sequence. E.g., in the walking/running app you could, after the fact and at the user's discretion, detect and fix the case where the user forgets to start or stop a workout. Or recalc distance in the case where you detect a bug or misapplication of your smoothing algorithm, etc. Or draw a pace graph or whatever.

In all these cases you can process the events synchronously.
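A rough sketch of what I mean, in Python. The event types (`Started`, `Paused`, `Moved`), their fields, and the flat x/y coordinates in metres are all made up for illustration; a real app would use GPS fixes and a smoothing step. The point is only that the stats are a plain synchronous function of the logged events, so they can be recomputed at any time:

```python
from dataclasses import dataclass
from math import hypot

# Hypothetical event types: names and fields are assumptions, not a real API.
@dataclass(frozen=True)
class Started:
    t: float  # seconds since the log began

@dataclass(frozen=True)
class Paused:
    t: float

@dataclass(frozen=True)
class Moved:
    t: float
    x: float  # metres east of an arbitrary origin
    y: float  # metres north of an arbitrary origin

def summarize(events):
    """Replay the whole log synchronously and derive distance and active time."""
    distance = 0.0
    active_time = 0.0
    running_since = None  # start time of the current active segment, if any
    last_pos = None       # last position seen during the active segment
    for e in events:
        if isinstance(e, Started):
            running_since = e.t
            last_pos = None
        elif isinstance(e, Paused):
            if running_since is not None:
                active_time += e.t - running_since
            running_since = None
        elif isinstance(e, Moved) and running_since is not None:
            if last_pos is not None:
                distance += hypot(e.x - last_pos[0], e.y - last_pos[1])
            last_pos = (e.x, e.y)
    # A log that ends while still running is not handled in this sketch.
    return {"distance_m": distance, "active_s": active_time}

# Locations were logged before the user remembered to press start at t=20.
log = [
    Moved(0, 0, 0), Moved(10, 30, 40),
    Started(20), Moved(30, 30, 40), Moved(40, 60, 80), Paused(50),
]

# Reprocess after the fact, treating the workout as started at t=0.
corrected = [Started(0)] + [e for e in log if not isinstance(e, Started)]
```

Here `summarize(log)` only counts movement after the late start, while `summarize(corrected)` replays the same logged locations with the start moved back. Nothing had to be migrated; the fix is just another pass over the log.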

I put this all in terms of a workout app, but there is a general pattern of an event-driven session-based activity or process, where you may want/need to derive new information from the events and the cost is to log the events (in a high-fidelity form so they could be fully reproduced).

Whether or not you need to use queues and distributed data stores is an independent decision.

vorpalhex 7y ago

That's not "Command Query Responsibility Segregation" (CQRS). That's modeling your data as a time series - which is a totally valid and perfectly useful model in many cases, but has nothing to do with the architectural pattern known as CQRS.

Martin Fowler gives the following simple definition of CQRS:

> At its heart is the notion that you can use a different model to update information than the model you use to read information.

CQRS takes data (events, time series, plain old records, whatever), stores it into a write store which acts as a singular source of truth, and queues that data to be stored in a (usually eventually consistent) read store as a projection - not simply duplicating the write store, but building a read record meant to be consumed directly by some client - with potentially several projections, one for each set of client needs.
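To make the shape concrete, here is a minimal in-process sketch in Python. Everything in it is an assumption for illustration: the order records, the class names, and the use of a `deque` standing in for a real queue. In a real deployment each piece would typically be a separate service or store:

```python
from collections import deque

class WriteStore:
    """Append-only source of truth; every write is also queued for projection."""
    def __init__(self):
        self.records = []
        self.queue = deque()  # stand-in for a real message queue

    def append(self, record):
        self.records.append(record)
        self.queue.append(record)

class CustomerTotalsReadStore:
    """Projection shaped for one client need: total spent per customer."""
    def __init__(self):
        self.totals = {}

    def apply(self, record):
        customer = record["customer"]
        self.totals[customer] = self.totals.get(customer, 0) + record["amount"]

class OrderCountReadStore:
    """A second projection over the same data, for a different client."""
    def __init__(self):
        self.count = 0

    def apply(self, record):
        self.count += 1

def run_indexer(write_store, read_stores):
    """Drain the queue into every projection; reads are stale until this runs."""
    while write_store.queue:
        record = write_store.queue.popleft()
        for store in read_stores:
            store.apply(record)
```

Note the read stores hold different data than the write store, not a copy of it, and they only catch up when the indexer runs. That gap is the eventual consistency.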

victorNicollet (OP) 7y ago

No need to pause writes or migrate stores. You can have the old and new versions of the read model co-exist and read from the same event stream without disabling writes. Once the new version is deployed and running, you shut down the old version (isn't this already your deployment model?)

It does change the scaling math, but don't automatically assume that every ES+CQRS system is intended for thousand-writes-per-second, terabytes-per-day kind of scales. If the system stays at ten-writes-per-second, megabytes-per-day, a read store (beyond in-memory projections), a queue or an indexer are not necessary.

As for asynchronous error handling and delays between reads/writes: why would that be a consequence of ES+CQRS? It would be a consequence of implementing ES+CQRS with asynchronous or eventually consistent behaviour...
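For what it's worth, a small-scale version can look something like this Python sketch. The `EventStream` class and the two read models are hypothetical names invented for this example, and everything lives in memory in a single process:

```python
class EventStream:
    """In-memory event log that updates subscribed read models synchronously."""
    def __init__(self):
        self.events = []
        self.subscribers = []

    def subscribe(self, read_model):
        # Replay history so a newly deployed read model catches up first.
        for event in self.events:
            read_model.apply(event)
        self.subscribers.append(read_model)

    def unsubscribe(self, read_model):
        self.subscribers.remove(read_model)

    def append(self, event):
        self.events.append(event)
        for read_model in self.subscribers:
            read_model.apply(event)  # finished before append() returns

class CountV1:
    """Old read model: total number of events."""
    def __init__(self):
        self.count = 0

    def apply(self, event):
        self.count += 1

class CountByKindV2:
    """New read model: number of events per kind."""
    def __init__(self):
        self.by_kind = {}

    def apply(self, event):
        kind = event["kind"]
        self.by_kind[kind] = self.by_kind.get(kind, 0) + 1
```

Subscribe `CountV1`, append some events, then subscribe `CountByKindV2` while writes continue: both versions stay current off the same stream, and the old one can be unsubscribed whenever the new one has taken over. Since `append` applies events before returning, a read right after a write already sees it.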