[1] http://openmirage.org/blog/introducing-irmin -- prev discussion: https://news.ycombinator.com/item?id=8053687
You can still have an update anomaly in an append-only setting. Teacher changes name (by submitting a change-of-name form) - now you need to replace all the class lists.
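To make that concrete, here is a minimal sketch of the anomaly, assuming class lists carry a denormalized copy of the teacher's name. The `ClassList` record, the in-memory `log`, and the `rename_teacher` helper are all hypothetical names invented for illustration; they don't come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassList:
    class_id: str
    teacher_name: str  # denormalized copy of the teacher's name
    version: int


# Append-only log: records are added, never modified in place.
log = []


def current_lists():
    """Return the highest-version record for each class."""
    latest = {}
    for rec in log:
        seen = latest.get(rec.class_id)
        if seen is None or rec.version > seen.version:
            latest[rec.class_id] = rec
    return latest


def rename_teacher(old, new):
    """Append a new version of every class list that embeds the old name."""
    stale = [r for r in current_lists().values() if r.teacher_name == old]
    for r in stale:
        log.append(ClassList(r.class_id, new, r.version + 1))
    return len(stale)


log.append(ClassList("math-101", "Ms. Smith", 1))
log.append(ClassList("math-201", "Ms. Smith", 1))
log.append(ClassList("art-101", "Mr. Lee", 1))

# One change-of-name form forces a new version of every affected list.
rewritten = rename_teacher("Ms. Smith", "Ms. Jones")
```

Nothing is overwritten, but a single logical change still fans out into one new record per class list, which is the update anomaly in append-only clothing.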
It references a paper from 2014-2015, so it's at least that new, but I still think it would be easier if the date were shown at the top or something.
I'd say 2015 :)
"7th Biennial Conference on Innovative Data Systems Research (CIDR’15) January 4-7, 2015, Asilomar, California, USA."
Having said that, this is a pet peeve of mine, too. I get frustrated by technical papers with no dates.
== Iffy Content ==
A. The document sometimes mixes technical content with opinion, but does not clearly separate the two. I'm happy to read normative articles in the right context, but I don't think the content here fits the format. For example, "There is an inexorable trend towards storing and sending immutable data." Many of us want to believe that, but is it actually true? I don't see any support or reference for this statement.
B. The technical content sometimes gets muddled. For example, "When storing immutable data within a consistent hash ring, you cannot get stale versions of the data. Each block stored has the only version it will ever have!" I have several concerns with this sentence. First, it is unnecessarily specific. You don't need to use consistent hashing to get the property described; there are other ways. Second, even with immutability, availability is not guaranteed; although you may not get 'different' versions of the data, you may get no data at all. Third, and most importantly, even with versioning, as mentioned in heading 6, you have significant coordination challenges. Immutable systems do simplify coordination, but not enough to justify the author's statement.
C. The terminology is sometimes confusing. For example, the use of the "DataSet" concept confused me more than it helped. I find it to be an unnecessary distinction that did not add clarity. (Not to mention that the term is used well before it is defined.) How is a "DataSet" different from a "data set", exactly? After reading back and forth, I struggle to understand why a new term was needed.
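To illustrate the first two concerns in point B, here is a rough sketch of a content-addressed block store, assuming blocks are keyed by the SHA-256 of their contents. The `Replica` class is a hypothetical toy, not anything described in the paper, and it deliberately uses no consistent hash ring.

```python
import hashlib


class Replica:
    """Toy immutable block store: the key is the hash of the contents."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Same contents always map to the same key, so there is
        # never a second version to store under it.
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str):
        # Returns the one and only version, or None if this
        # replica has not received the block.
        return self._blocks.get(key)


a, b = Replica(), Replica()
key = a.put(b"class list v1")

# Replication to b has not happened yet.
from_a = a.get(key)
from_b = b.get(key)
```

Here `from_a` is the block and `from_b` is `None`: the read can never be stale, but it can still come back empty. Deciding which key is the "current" one is a separate coordination problem that this sketch does not address.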
== Questionable Style ==
A. The content reads more like a slide deck than a paper.
B. The transitions between different sections are choppy.
C. The writing is informal, and that is putting it charitably. One section is titled "Normalization Is for Sissies". Trust me, I like bad jokes and puns -- I just don't think they fit the format. "Hard Disks: Getting the Shingles"? Why stop there? Why not add "Yes, ladies and gentlemen, I'll be here all night!"
D. What is the purpose of the gray callout boxes with the exclamation-pointed sentences? Stylistically, they seem out of place. They make the paper look more like a badly formatted dinner menu or fundraising letter and less like a technical paper. For example, "High availability of immutable blocks is available now! Google, Amazon, Facebook, Yahoo, Microsoft, and more keep petabytes and exabytes of immutable data!"
== Summary ==
I would not hold this up as an example for people to read or learn from. I really don't like being this critical, but I feel like it is important for me to say something. I think papers, especially in this area, should meet a certain bar. I've tried to offer constructive criticism.
I hate to say it, but for a minute I wondered if this paper might be tongue-in-cheek. The graphic with the caption, "Fill out Part 3 and keep the goldenrod page from the back", is sufficiently bad that it is funny. (No offense intended.)
the CAP theorem means you either give _inconsistent_ answers (what is the state now? it's X) to two different nodes, or you sacrifice availability.
this is the paper in which the theorem was formalized. they talk in terms of 'the initial value' of an atomic object, and its 'subsequent values' - so this only makes sense when the system is _implemented_ in terms of objects which have values that change over time as the result of operations:
--- More formally, let v0 be the initial value of the atomic object. Let α1 be the prefix of an execution of A in which a single write of a value not equal to v0 occurs in G1, e ---
If requests that come in are expressed as "give me the most recent value of A," and the response comes back with a history of all operations done on A along with timestamps, then you're no longer bound by the CAP theorem because 'an operation was missing' is no longer a _problem_ - it's just an outdated answer.
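A rough sketch of what such a history-based response might look like, assuming each write is recorded as a `(timestamp, value)` pair and timestamps are unique. The function names and the two example nodes are hypothetical, and this only illustrates the argument above; it is not a claim about any particular system.

```python
from typing import FrozenSet, Optional, Tuple

Op = Tuple[int, str]  # (timestamp, value written); timestamps assumed unique


def read_history(known_ops) -> FrozenSet[Op]:
    """A node answers with every operation it knows about, not one value."""
    return frozenset(known_ops)


def merge(*histories: FrozenSet[Op]) -> FrozenSet[Op]:
    """Histories only grow, so combining answers is a set union."""
    merged = set()
    for h in histories:
        merged |= h
    return frozenset(merged)


def latest_value(history: FrozenSet[Op]) -> Optional[str]:
    """Derive 'the most recent value' from whatever history was returned."""
    if not history:
        return None
    return max(history)[1]  # tuples compare by timestamp first


# node2 was partitioned away and missed the write at timestamp 2.
node1 = read_history({(1, "X"), (2, "Y")})
node2 = read_history({(1, "X")})

outdated = latest_value(node2)
combined = latest_value(merge(node1, node2))
```

The answer from `node2` is a subset of the answer from `node1`, so the two responses never contradict each other; one is simply older. Whether an older answer is acceptable still depends on what the client does with it.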