So if you had to optimize for raw speed, why not choose mutable data?
Conceptually yes, but the implementation doesn't necessarily need to work that way under the hood: https://www.roc-lang.org/functional#opportunistic-mutation
That's not generally true. Many immutable languages use "persistent" data structures, where "persistent" means that much of the original structure persists in the new one.
For more, see:
- Purely Functional Data Structures by Okasaki: https://www.cs.cmu.edu/~rwh/students/okasaki.pdf
- Phil Bagwell's research, e.g., https://infoscience.epfl.ch/record/64398/files/idealhashtree...
That is not true in general. There are plenty of data structures that can be updated without forcing a full copy. Lists, trees, sets, maps, etc. All of these are common in functional programming. This is discussed in the article (e.g. "Append-Only Computing").
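To make the "no full copy" point concrete, here is a minimal sketch of path copying in a binary search tree. The names (`Node`, `insert`, `to_list`) are made up for illustration and are not taken from the article or from any particular language runtime; real persistent maps usually use balanced trees or hash array mapped tries rather than this unbalanced toy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node (hypothetical, for illustration only)."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Return a new tree containing key; the old tree is left untouched."""
    if root is None:
        return Node(key)
    if key < root.key:
        # Copy only this node; the right subtree is shared as-is.
        return Node(root.key, insert(root.left, key), root.right)
    if key > root.key:
        # Copy only this node; the left subtree is shared as-is.
        return Node(root.key, root.left, insert(root.right, key))
    return root  # key already present: reuse the whole tree


def to_list(root: Optional[Node]) -> list:
    """In-order traversal, used here only to inspect a version."""
    if root is None:
        return []
    return to_list(root.left) + [root.key] + to_list(root.right)


v1 = None
for k in (50, 30, 70, 20, 40):
    v1 = insert(v1, k)

v2 = insert(v1, 80)  # copies only the nodes on the path 50 -> 70
```

After this runs, `v1` still holds its original five keys, and `v2.left is v1.left` is true: the whole left subtree is shared between the two versions rather than copied.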
while querying a database each transaction sees a snapshot of data (a database version) as it was some time ago, regardless of the current state of the underlying data
https://www.postgresql.org/docs/7.1/mvcc.html
I've made a few non-technical eyes go wide by explaining A) that this is done and B) how it is done. The non-tech crypto/blockchain enthusiasts I've met get really excited when they learn you can make a set of data immutable without blockchain / merkle trees. Actually, explaining that is a good way to introduce the concept of a merkle tree / distributed ledger, and why "blockchain" is specifically for systems without a central authority.
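The snapshot behavior quoted from the PostgreSQL docs can be illustrated with a toy versioned store. This is only a sketch of the idea, not how PostgreSQL implements MVCC; `VersionedStore` and its methods are hypothetical names, and it ignores concurrent writers, aborts, and cleanup of old versions.

```python
import bisect
from collections import defaultdict


class VersionedStore:
    """Toy multi-version store: writes append versions, reads pick one."""

    def __init__(self):
        # key -> list of (commit_id, value), in increasing commit_id order
        self._versions = defaultdict(list)
        self._last_commit = 0

    def write(self, key, value):
        """Append a new version instead of overwriting the old one."""
        self._last_commit += 1
        self._versions[key].append((self._last_commit, value))
        return self._last_commit

    def snapshot(self):
        """A snapshot is just the id of the latest commit seen so far."""
        return self._last_commit

    def read(self, key, snapshot_id):
        """Return the newest value committed at or before snapshot_id."""
        versions = self._versions.get(key, [])
        commit_ids = [commit_id for commit_id, _ in versions]
        i = bisect.bisect_right(commit_ids, snapshot_id)
        return versions[i - 1][1] if i else None


store = VersionedStore()
store.write("balance", 100)
snap = store.snapshot()        # a reader starts here
store.write("balance", 250)    # a later write does not disturb that reader

old_view = store.read("balance", snap)
new_view = store.read("balance", store.snapshot())
```

Here `old_view` is 100 and `new_view` is 250: the earlier reader keeps seeing the data as it was when its snapshot was taken, because nothing was overwritten.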
(Bi)Temporal and immutable tables are especially useful for things like HR, PTO, employee clock activity, etc. Helps keep things auditable and correct.
Who needs more than one table? >:)
More complex models can be built and stored separately. The great benefit of this method being that, once you're unhappy with your table model, you can trash it and rebuild it from scratch without regard for data migration.
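A rough sketch of that idea, assuming a single append-only list of events as the "one table" and two throwaway models derived from it. The event shapes and function names are invented for the example.

```python
from collections import defaultdict

# The single append-only "table": events are only ever added.
events = [
    {"employee": "ana", "type": "pto_granted", "hours": 80},
    {"employee": "ana", "type": "pto_taken", "hours": 16},
    {"employee": "bo", "type": "pto_granted", "hours": 40},
    {"employee": "ana", "type": "pto_taken", "hours": 8},
]


def build_balances(log):
    """Derived model 1: remaining PTO hours per employee."""
    balances = defaultdict(int)
    for event in log:
        sign = 1 if event["type"] == "pto_granted" else -1
        balances[event["employee"]] += sign * event["hours"]
    return dict(balances)


def build_request_counts(log):
    """Derived model 2: number of PTO requests per employee."""
    counts = defaultdict(int)
    for event in log:
        if event["type"] == "pto_taken":
            counts[event["employee"]] += 1
    return dict(counts)


# Either model can be thrown away and rebuilt by replaying the log;
# the log itself never needs a migration.
balances = build_balances(events)
request_counts = build_request_counts(events)
```

Swapping `build_balances` for a different model means writing a new function and replaying `events`, not migrating stored data.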
This does require either knowing the schema at the point in time or recording enough information to do a schema on read.
The other option, of course, is to run a table like an API: always adding, never removing.
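As a sketch of how "schema on read" and "always adding, never removing" can fit together: each stored row records the schema version it was written under, and the reader fills in defaults for fields added later. `ADDED_FIELDS`, `read_row`, and the field names are all assumptions made up for this example.

```python
# Hypothetical additive schema history: version -> fields introduced in it,
# each with the default to use for rows written before that version.
ADDED_FIELDS = {
    1: {"name": ""},
    2: {"department": "unknown"},
    3: {"remote": False},
}


def read_row(row):
    """Present a stored row in the latest shape without rewriting it."""
    result = {}
    for version in sorted(ADDED_FIELDS):
        for field, default in ADDED_FIELDS[version].items():
            if version <= row["schema_version"]:
                # The field existed when the row was written.
                result[field] = row[field]
            else:
                # The field was added later: fall back to its default.
                result[field] = default
    return result


old_row = {"schema_version": 1, "name": "Ana"}
new_row = {"schema_version": 3, "name": "Bo", "department": "HR", "remote": True}

upgraded_old = read_row(old_row)
upgraded_new = read_row(new_row)
```

Because fields are only ever added, old rows stay valid forever and `old_row` itself is never modified; only the view of it changes.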
It gets tricky when you need to change the schema without breaking historical data or queries. SQL databases could do a lot more to make immutability easier and widespread.
Databases could treat the columns as the fundamental unit with tables being not much more than a view of a bunch of columns that can change over both space (partitioning) and time (history).
This is a very important point: whatever system or solution you build, do not overengineer, and always remember that premature optimization is the root of all evil.
It used to be blockchain, and now ML/AI seems to be the new fad. Most likely the majority of solutions being designed with ML/AI today do not need it, and using it just makes them expensive, slow, complex, non-deterministic, etc.
People need to wake up and smell the coffee: ultimately ML/AI is just one tool among many in the toolbox.
I have also seen a scheme where you store only the hash in the log and keep a separate lookup table for the sensitive data, which you can redact more easily without messing with the log.
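Roughly, that scheme might look like the sketch below. The names (`log`, `vault`, `record`, `resolve`, `redact`) are hypothetical, and the per-value random salt is my own assumption, added so that low-entropy values such as email addresses cannot be recovered by hashing guesses against the log.

```python
import hashlib
import secrets

log = []    # append-only: entries are never edited or removed
vault = {}  # mutable side table: digest -> salt and sensitive value


def record(event_type, sensitive_value):
    """Append an event that refers to the sensitive value only by hash."""
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + sensitive_value).encode()).hexdigest()
    vault[digest] = {"salt": salt, "value": sensitive_value}
    log.append({"type": event_type, "ref": digest})
    return digest


def resolve(entry):
    """Look up the sensitive value for a log entry, if it still exists."""
    stored = vault.get(entry["ref"])
    return stored["value"] if stored else "[redacted]"


def redact(digest):
    """Forget the sensitive value; the log entry itself is untouched."""
    vault.pop(digest, None)


ref = record("signup", "ana@example.com")
log_before = [dict(entry) for entry in log]

seen_before = resolve(log[0])
redact(ref)
seen_after = resolve(log[0])
```

After `redact(ref)`, the log is byte-for-byte what it was before, but the entry can no longer be resolved back to the email address.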
Seems like a very powerful architecture that is both simple and decouples many concerns.
Substitute streams for tape i/o, and this description of Samza sounds like it could be very similar to that vision.
* as far as I know, their exposition of the WAL and tradeoffs in its implementation has aged well. Any counter opinions?
Immutability Changes Everything (2016) - https://news.ycombinator.com/item?id=27640308 - June 2021 (94 comments)
Immutability Changes Everything - https://news.ycombinator.com/item?id=10953645 - Jan 2016 (4 comments)
Immutability Changes Everything [pdf] - https://news.ycombinator.com/item?id=8955130 - Jan 2015 (25 comments)
(Reposts are fine after a year or so; links to past threads are just to satisfy extra-curious readers)
It often works out, but if you're not looking at the right version then you're risking a merge conflict.