“Follow-The-Workload” Beats the Latency-Survivability Tradeoff in CockroachDB (opens in new tab)

(cockroachlabs.com)

78 pointsawoods1878y ago28 comments

28 comments

22 comments · 7 top-level

muxator8y ago· 5 in thread

I always longed for the existence of such an Ops-friendly DB, which - building on solid distributed systems concepts - could make easy and correct the hard things (consistency, geographical replication, effective possibility of zero downtime).

Now that I have a good candidate, I find myself wondering whether the performances would be good enough to ditch traditional DBs (I know: I should define "workload" before thinking about "performances"...)

I guess is the curse of tradeoffs: nothing in life is free.

jinqueeny8y ago

Another good candidate is TiDB (https://github.com/pingcap/tidb). It has elastic scalability, ACID compliances, high availability, etc.

At least, TiDB, CRDB, RethinkDB are open source :)

zokier8y ago

> At least, TiDB, CRDB, RethinkDB are open source :)

So is CockroachDB too?

1 more reply

arrty888y ago

We had it with RethinkDB already!

manigandham8y ago

Except it has lacking performance and is now impractical without a backing support company.

a0128y ago

Does rethinkdb have ACID? No?

1 more reply

sempron648y ago· 4 in thread

Can anyone elaborate as to how CockroachDB compares to Scylla? It seems to me like they're very similar on a surface level.

_uhtu8y ago

Sure!

Cockroach DB is a full SQL compatible, strongly consistent key-value store. This means you can get the scalability and performance of a key-value store alongside the comfortable SQL query language. You, and all the other developers who work with you who already know SQL, can do pretty much all the things you know and love from SQL like joins, secondary indexes, etc with full ACID compliance. This means that when you read data from the database you always know with 100% certainty you're going to get up to date values (no stale reads).

ScyllaDB is similar in that data is stored in tables with a defined schema, but it uses a different query language, CQL[1], which is often similar to SQL. You can't to joins but you can have secondary indexes. You can store most of the same data types that you know and love from a standard SQL store. Interestingly enough, you get to CHOOSE the level of consistency you get, so you can make your ScyllaDB strongly consistent or choose from an array of eventually consistent options[2]. Most people however go with one of the eventually consistent options, which allow Scylla to be insanely performant and scalable. At the cost of strong consistency, you get an extremely high performance at an almost infinite scale. CockroachDB, while performant and scalable, can't match it here. It stands almost on a tier of it's own in terms of scalability and performance.

So really, the choice is yours based on what you're looking for. I'd choose CockroachDB for my purposes since I'm not storing Apple levels of data and consistency is important to my work, but your specifications and needs may be different.

[1] http://docs.scylladb.com/getting-started/ddl/ [2] http://docs.scylladb.com/architecture/architecture-fault-tol...

manigandham8y ago

Ultimately the same foundations of a globally distributed key/value store. Scylla is the Cassandra is a mashup of BigTable/Dynamo wide-column advanced key/value design allowing for very high scalability and availability, but the data model and querying abilities are not as flexible. Scylla does have variable per-query consistency settings but is primarily eventually consistent. Does support BATCH statements which are atomic updates, but no Lightweight-Transactions yet to read-before-write, but does have counters now. It's not quite feature parity with Cassandra but quickly getting there.

CockroachDB also uses a key/value store but puts a postgres-compatible SQL layer on top, derived from the Google Spanner approach, so you can get (almost) all the querying abilities and data modeling of a relational database. They're slated to have JSON datatypes soon that will make it very compelling as a general purpose, highly reliable datastore for all of your core data in multiple regions.

zackelan8y ago

If you want to read about the theoretical underpinnings, Cassandra is derived from the original Dynamo paper[0] from Amazon, and Scylla is a drop-in replacement for Cassandra written in C++ instead of Java. Cockroach follows more closely the Google Spanner[1] approach.

For a more practical summary, compare the architecture overviews of Cassandra[2] and Cockroach[3].

0: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp...

1: https://static.googleusercontent.com/media/research.google.c...

2: https://docs.datastax.com/en/cassandra/3.0/cassandra/archite...

3: https://www.cockroachlabs.com/docs/stable/architecture/overv...

openasocket8y ago

They're pretty different. Scylla is essentially a C++ rewrite of Cassandra, while CockroachDB aims to be a true distributed SQL database. Cassandra and Scylla do not support transactions (or atomicity or strong consistency) or joins.

ahahuaya8y ago· 2 in thread

Cockroachdb looks very exciting. I’m waiting for it’s Postgres compatibility to be far enough along that it works with Elixir/Phoenix. Is that likely to happen any time soon?

legojoey178y ago

If you are willing to give it a shot, there is currently a fork of postgrex from someone in the Elixir community! It handles some of rough edges of the incompatibility. You can find it at https://hexdocs.pm/postgrex_cdb/readme.html and the source code at https://github.com/jumpn/postgrex.

For any other issues you run into when using it, you may want to see the discussion about postgrex compatibility at https://github.com/cockroachdb/cockroach/issues/5582. If you run into different problems please do file the issue :).

I'm also an Elixir junkie, so I would also love to see ORM compatibility here! I definitely want to dedicated time to it if I can. (Disclaimer: I'm currently interning at Cockroach Labs.)

cpursley8y ago

I'd love to see a port of https://github.com/begriffs/postgrest to Elixir. Cockroachdb + Elixir-based Postgrest implementation would be the ultimate platform for realtime distributed web scale.

adrianratnapala8y ago· 2 in thread

So is this for helping with read, but not write latency?

I know little about CockroachDB, but I vaguely remember that it has consistent replication. If so, then writes would require some round trips before consensus is reached, killing any latency benefits from having the master follow the load (unless you move all your replicas around!).

a-robinson8y ago

It helps with both, but is more dramatic for reads.

Reads go from 1 RTT to 0 RTTs, while writes go from 2 RTTs to 1 RTT. For example, consider a cluster in which the nodes are 200ms RTT away from each other.

* A read will take 200ms if the leaseholder isn't in the local datacenter vs single-digit milliseconds if it is local.

* A write will take 400ms if the leaseholder isn't in the local datacenter vs 200 milliseconds if it is local.

The write takes 400ms because it takes 100ms to get from the local datacenter to leaseholder, 200ms to reach consensus (the leaseholder needs to send a request to and get a response from one of the followers), and another 100ms for the response from the leaseholder to get back to the local datacenter where the request originated.

humanfromearth8y ago

Really looking forward to see this in action. Is follow the workload already present in 1.1.3?

Is there a bit more detailed technical doc on how this is actually achieved ideally with some examples? When is it decided that the lease holder moves from one node to another? Is it based on some sort of stats?

Edit: found the doc follow-the-workload here: https://www.cockroachlabs.com/docs/stable/demo-follow-the-wo...

1 more reply

toolslive8y ago· 2 in thread

Isn't "Follow-the-workload" something that is naturally achieved when Mencius is used iso Raft?

a-robinson8y ago

WAN-optimized consensus protocols like Mencius and Egalitarian Paxos indeed are better for writes because any node can commit a write in one round trip, but they don't allow for reads with zero round trips. Every read must go through the consensus machinery in order to ensure consistency, whereas in CockroachDB the leaseholder can serve reads without needing to consult other nodes.

toolslive8y ago

why do reads have to go through the consensus machinery ? If they don't go through the machinery, read results will still be consistent, they just might not include the very last updates, but you cannot avoid that. Raft has the same problem (they all have, and it's unavoidable): suppose you read from the Leader (Master, whatever you call it); the leader sends you a perfect fresh consistent value; however, your network takes time to deliver it to you and meanwhile an update might have happened. It just shows that the key value store should provide atomic multi updates (or at least compare-and-swap )

1 more reply

ericand8y ago

For reference, this is part 2 of a two part series. Here is part 1: https://www.cockroachlabs.com/blog/tunable-controls-for-late...

jinqueeny8y ago

Great post! Do you have benchmark to show how CRDB is compared with TiDB in terms of performance and latency? And how CRDB is compared with Amazon Aurora with PostgreSQL compatibility?

j / k navigate · click thread line to collapse

28 comments

22 comments · 7 top-level

muxator8y ago· 5 in thread

I guess is the curse of tradeoffs: nothing in life is free.

jinqueeny8y ago

Another good candidate is TiDB (https://github.com/pingcap/tidb). It has elastic scalability, ACID compliances, high availability, etc.

At least, TiDB, CRDB, RethinkDB are open source :)

zokier8y ago

> At least, TiDB, CRDB, RethinkDB are open source :)

So is CockroachDB too?

1 more reply

arrty888y ago

We had it with RethinkDB already!

manigandham8y ago

Except it has lacking performance and is now impractical without a backing support company.

a0128y ago

Does rethinkdb have ACID? No?

1 more reply

sempron648y ago· 4 in thread

Can anyone elaborate as to how CockroachDB compares to Scylla? It seems to me like they're very similar on a surface level.

_uhtu8y ago

Sure!

[1] http://docs.scylladb.com/getting-started/ddl/ [2] http://docs.scylladb.com/architecture/architecture-fault-tol...

manigandham8y ago

zackelan8y ago

For a more practical summary, compare the architecture overviews of Cassandra[2] and Cockroach[3].

0: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp...

1: https://static.googleusercontent.com/media/research.google.c...

2: https://docs.datastax.com/en/cassandra/3.0/cassandra/archite...

3: https://www.cockroachlabs.com/docs/stable/architecture/overv...

openasocket8y ago

ahahuaya8y ago· 2 in thread

Cockroachdb looks very exciting. I’m waiting for it’s Postgres compatibility to be far enough along that it works with Elixir/Phoenix. Is that likely to happen any time soon?

legojoey178y ago

I'm also an Elixir junkie, so I would also love to see ORM compatibility here! I definitely want to dedicated time to it if I can. (Disclaimer: I'm currently interning at Cockroach Labs.)

cpursley8y ago

I'd love to see a port of https://github.com/begriffs/postgrest to Elixir. Cockroachdb + Elixir-based Postgrest implementation would be the ultimate platform for realtime distributed web scale.

adrianratnapala8y ago· 2 in thread

So is this for helping with read, but not write latency?

a-robinson8y ago

It helps with both, but is more dramatic for reads.

Reads go from 1 RTT to 0 RTTs, while writes go from 2 RTTs to 1 RTT. For example, consider a cluster in which the nodes are 200ms RTT away from each other.

* A read will take 200ms if the leaseholder isn't in the local datacenter vs single-digit milliseconds if it is local.

* A write will take 400ms if the leaseholder isn't in the local datacenter vs 200 milliseconds if it is local.

humanfromearth8y ago

Really looking forward to see this in action. Is follow the workload already present in 1.1.3?

Edit: found the doc follow-the-workload here: https://www.cockroachlabs.com/docs/stable/demo-follow-the-wo...

1 more reply

toolslive8y ago· 2 in thread

Isn't "Follow-the-workload" something that is naturally achieved when Mencius is used iso Raft?

a-robinson8y ago

toolslive8y ago

1 more reply

ericand8y ago

For reference, this is part 2 of a two part series. Here is part 1: https://www.cockroachlabs.com/blog/tunable-controls-for-late...

jinqueeny8y ago

Great post! Do you have benchmark to show how CRDB is compared with TiDB in terms of performance and latency? And how CRDB is compared with Amazon Aurora with PostgreSQL compatibility?

j / k navigate · click thread line to collapse