Partition tolerance does not mean your distributed system can't be consistent and available because your network dropped one packet, or one node failed. What would be the point of such a definition? Instead, the CAP theorem implies that while the network is partitioned, consistency or availability must be sacrificed. In the case of the dropped packet, once it is retransmitted the partition is healed and progress can be made. Or in the case of the failed node, nothing says that the rest of the system can't be consistent and available, so that the system as a whole maintains that property. There is no requirement that the unavailable node be available.
Truly partition tolerant systems are those that continue to function in the face of a prolonged partition, and those are the systems that must sacrifice either consistency or availability.
"Consistent sometimes" is not the same thing as "Consistent" and "Available sometimes" is not the same thing as "Available" - and so "Consistent and Available sometimes" is not the same as "Consistent and Available".
I believe you might be guilty of confusing "Eventual Consistency" with "Consistency".
Funnily enough, no-one has found much use for "Eventual Availability" so far.
A distributed system is considered available if "every request received by a non-failing node [results] in a response." It does not mean you cannot retransmit or retry.
Similarly, the consistency guarantee only requires that there exist a total order on operations. Failures are ok, as we're allowed to retransmit, retry, and otherwise tolerate faults. There is no inconsistency, nor is anything eventual.
My point is that a "temporary" partition is just a fault, and as long as the fault is shorter than the allowed response time of the system, it doesn't make a difference.
If the network is allowed to drop packets there are times where either you must not respond to requests (as doing so would violate sequential consistency) or you must respond incorrectly, potentially with stale information.
The network partition that forces you to drop one of these guarantees might be quite dramatic - but note that, for example, a quorum system is only available on the majority side of its partition. Therefore if a single node is unable to deliver messages (due to a network partition event), it will not be able to correctly respond to requests and must either not respond to its clients or respond incorrectly.
So then what am I missing? It sounds to me like the rest of your post is reiterating what you just called incorrect.
What Redis Cluster will guarantee is that you can have M-1 nodes, with M being the number of replicas per "hash slot", that can go down, and/or get partitioned.
So this is a form of "weak" tolerance to partition, where at least a given percentage of the nodes must remain up and able to talk to each other.
But in the practice this is how most networks work. Single computers fail, and Redis Cluster will be still up. Single computers (or up to M-1) can experience networking problems, and Redis will continue to work.
In the unlikely condition that the network is split in two halves the cluster will start replying with an error to the clients.
This means that the sys admins have to design the network so that it is unlikely that there are strange split patterns, like A and B can talk to C that can tolk to D but blablabal... in high performance network with everything well-cabled and without complex routing this should not be a problem, IMHO.
In the rest of your post, you seem to be sacrificing consistency; one server is down, and thus not receiving any updates from the other servers when data gets updated.
I'm not sure you understood the point of the article, so I'll try to restate it: When part of your system goes down (and it will), you can choose between refusing requests, in which case you sacrifice availability, or serving requests, in which case you sacrifice consistency, since the part of the system which is down cannot be updated when you update data, or cannot be queried in the case of data which is insufficiently replicated. You _cannot_ choose both, since that would require communicating with the downed server.
In Redis Cluster there is no cluster data communication if not for resharding that only works when the whole cluster is on and is done by the sys administrator when adding a node.
So in normal conditions, a node will either:
1) Accept a query, or
2) Tell the client: no, ask instead 1.2.3.4:6380
All the nodes are connected only to make sure the state of the cluster is up. If there are too much nodes down from the point of view of a single node it will reply to the client with a cluster error.
What I'm sacrificing is only consistency because in every given time there is only a single host that is getting the queries for a given subset of keys.
The exception is in the resharding case that is also fault-tolerant. Or slave election (fault tolerance is obtained via replicas).
As a side note, the clients should cache what node is responsible for a given set of keys, so after some time and when there are no failures/resharding in act, every client will directly ask the right node, making the solution completely horizontally scalable.
Dummy clients will just do always the ask-random-node + retry stage if they are unable to take state.
Edit: there are little fields like this that are totally in the hands of academia. My contribution is from the point of view of a dummy hacker that can't understand complex math but that will try to be much more pragmatic.
Also, beware of confusing your local failure detector for a global one. The emergent behavior is a pain when you don't take this into account.
Although it is pretty much the only choice for general purpose file sync, it's still not a good choice. It is difficult to train non-technical staff to deal with inconsistencies on resync, and they don't want to have to deal with it. I've had success with Synctus precisely because it guarantees consistency (it is CP, and any one node keeps the A for a given file). Of course, this only works for mostly-on systems.
Do I understand correctly?
For a CA system, any node which is unable to assure global consistency reports itself as failed. It will neither return bogus results, nor hang indefinitely.
Reporting an error condition counts as an availability violation.
The reason failures are hard to detect in asynchronous networks is that permissible message transit times are unbounded; ie they refuse to acknowledge the presence of any partition. If you acknowledge the possibility of partitions, then your system is by definition not asynchronous.
Reporting an error condition counts as an availability violation.
This is bullshit. Per the definition quoted in the linked article, availability only means that "...every request must terminate.". It is not required that it terminate successfully.
This lack of availability is different from the availability in the A of CAP, since that availability holds only so long as the network is not partitioned (by definition in a CA system).
Such a system might not be considered a distributed system at all (although it may still be distributing load), since a partition-intolerant system is effectively one system as far as the CAP theorem is concerned.
So it's essentially a special case of the CAP theorem, but it is still useful to describe it as CA.
Therefore it's not a CA system, but a C system.
I think this is the tweet that prompted the post: http://twitter.com/JamesMPhillips/status/26502076366
http://www.cloudera.com/blog/2010/04/cap-confusion-problems-...
3 nodes, 2 must agree. When the partition heals, the resolution process happens. It would have to be a SERIOUSLY bad network design and quorum setting that allows a quorum on both sides of the split.
It's just like eventual consistency. We're not talking days or even minutes. We're talking milliseconds/seconds of partition split. If you have a partition split for days, you have OTHER issues to address.