In principle you could just query the transaction log for every change to your data and compute the final state every time. Obviously this would be onerous, so in normal operation we just use the latest state.
When things go wrong, the transaction log is useful for understanding why, and also for rewinding or replaying the database to the correct state.
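To make the replay/rewind idea concrete, here is a rough sketch. The log format (sequence number, operation, key, value) and the `replay` function are made up for illustration; a real database's transaction log is far more involved.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical log entry: (sequence_number, operation, key, value)
LogEntry = Tuple[int, str, str, Any]


def replay(log: List[LogEntry], up_to: Optional[int] = None) -> Dict[str, Any]:
    """Rebuild state by applying entries in order.

    If up_to is given, stop after that sequence number, which
    effectively rewinds the state to that point in time.
    """
    state: Dict[str, Any] = {}
    for seq, op, key, value in log:
        if up_to is not None and seq > up_to:
            break
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return state


log = [
    (1, "set", "alice", 100),
    (2, "set", "bob", 50),
    (3, "set", "alice", 70),
    (4, "delete", "bob", None),
]

current = replay(log)           # state after every change
rewound = replay(log, up_to=2)  # state as it was after entry 2
```

Keeping only `current` around is the "latest state" shortcut; `rewound` is what the log buys you when something goes wrong.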
Some databases ship these transaction logs around between replicas to keep them all in sync.
The work presented here is an interesting application of the same basic mechanism to keep different flavours of datastores in sync.
Recently we very briefly explored the idea of using this mechanism to implement partial replication for partitioned reporting data stores. Unfortunately our current platform, SQL Azure, doesn't grant direct access to the transaction log. (On balance this is a good thing, because it's handling all the replication etc.)
Whilst there is an element of compactness to capturing semantic events, the benefit of a simpler mechanism like logs is that you don't need a full database engine to parse the data, and it may end up offering better performance (for example, there's no need to work out what a commit or rollback entails on every node; just do it on the master node and let the other nodes read the logs to know what to update).
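A toy sketch of that split, under heavy simplification: the `Primary` and `Replica` classes and the "only committed changes reach the log" rule are assumptions for illustration, not how any particular database does it.

```python
class Primary:
    """Decides transaction outcomes and records the resulting changes."""

    def __init__(self):
        self.state = {}
        self.log = []  # assumed to hold committed changes only

    def transact(self, changes, commit=True):
        # Commit/rollback is resolved here, once. A rolled-back
        # transaction never produces log entries in this sketch.
        if not commit:
            return
        for key, value in changes.items():
            self.state[key] = value
            self.log.append((key, value))


class Replica:
    """Applies log entries blindly; no transaction logic needed."""

    def __init__(self):
        self.state = {}
        self.applied = 0  # position in the log already applied

    def catch_up(self, log):
        for key, value in log[self.applied:]:
            self.state[key] = value
        self.applied = len(log)


primary = Primary()
replica = Replica()

primary.transact({"a": 1})
primary.transact({"b": 2}, commit=False)  # rolled back on the primary
primary.transact({"a": 3, "c": 4})

replica.catch_up(primary.log)
```

The replica only needs to know how to read entries and where it left off, which is why something much lighter than a full database engine can consume the log.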
CustomerCreated { stuff }
CustomerMadeOrder { custId, stuff }
ItemAddedToOrder { orderId, stuff }
etc..
This is the event sourcing view of the world. http://engineering.linkedin.com/distributed-systems/log-what...
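As a rough sketch of what folding those events into state might look like: the event names come from the list above, but the fields (`name`, `sku`) are hypothetical stand-ins for "stuff", and `project` is just one possible way to build a read view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerCreated:
    cust_id: str
    name: str  # hypothetical stand-in for "stuff"


@dataclass(frozen=True)
class CustomerMadeOrder:
    cust_id: str
    order_id: str


@dataclass(frozen=True)
class ItemAddedToOrder:
    order_id: str
    sku: str  # hypothetical stand-in for "stuff"


def project(events):
    """Fold an ordered list of events into a current-state view."""
    customers = {}
    orders = {}
    for event in events:
        if isinstance(event, CustomerCreated):
            customers[event.cust_id] = {"name": event.name, "orders": []}
        elif isinstance(event, CustomerMadeOrder):
            customers[event.cust_id]["orders"].append(event.order_id)
            orders[event.order_id] = []
        elif isinstance(event, ItemAddedToOrder):
            orders[event.order_id].append(event.sku)
    return {"customers": customers, "orders": orders}


events = [
    CustomerCreated("c1", "Ada"),
    CustomerMadeOrder("c1", "o1"),
    ItemAddedToOrder("o1", "widget"),
    ItemAddedToOrder("o1", "gadget"),
]

view = project(events)
```

The events are the source of truth here; `view` is disposable and can be rebuilt, or built differently for another kind of datastore, from the same list.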