Comdb2 – Bloomberg's distributed RDBMS under Apache 2

(github.com)

240 points · Callicles · 9y ago · 65 comments

NickGerleman9y ago· 21 in thread

This is really cool but I'm curious why Bloomberg would need this. I.e., what special needs does Bloomberg have that would lead a primarily non-engineering company to invest the resources to create this? Was there nothing off the shelf that would have fit their needs?

I don't mean that in a derogatory way, I'm just curious what motivated making this.

raldi9y ago

"Non-engineering company" -- ha!

But to answer your question: Mike Bloomberg's autobiography talks about this. When they started in the 80's there wasn't as much great off-the-shelf software as there is today.

Their customers were (and always have been) insanely demanding when it comes to reliability and speed. The last thing Mike wanted to do was be caught sheepishly explaining to the CIO of Merrill Lynch, "Well, gee, our Oracle database has this bug they can't fix for the next two weeks..." or even "...this optimization they can't make for the next six months"

This is a company that invented its own layer-2 network protocols 35 years ago just to squeeze every last drop of performance and reliability out of the hardware. Of course they wrote their own database (actually two -- there was a comdb 1, of course, too).

apaprocki9y ago

Just for some scale, we have 4k+ engineering employees and if we were a public company, we'd be somewhere around the 5th largest software company in the world by revenue. (Strange stat, but our primary product is software and accounts for over 3/4 revenue)

raldi9y ago

Looking forward to seeing a fastsend / PRCCOM paper one day.

magnawave9y ago

I definitely wouldn't call Bloomberg a "non-engineering" company if you've ever worked with any of their trading / market data products. Pretty much true of any trading side finance company nowadays.

adenadel9y ago

Bloomberg started as a technology company. Sure, they have expanded into media, but I'm pretty sure they are still first and foremost a provider of software (plus hardware) and data.

raldi9y ago

I would bet that the Terminals are responsible for something like 140% of their profits, and that the media empire is a huge loss leader / vanity project.

alexjscotti9y ago

So. We did a few things better than the industry to this day. HA is a spectrum. Transparent masking of all failures in any state of an arbitrary SQL transaction is on the far right side of that spectrum.

To directly answer your question: go look at where 'replication' was on rdbms 14 years ago. PostgreSQL? MySQL?

ericpearl9y ago

14 years ago Sybase had strong replication. Sybase was wall street's darling for a long time starting in the late 80's.

75dvtwin9y ago

>>what special needs does Bloomberg have that would lead to a primarily non-engineering company investing the resources to create this

They probably open sourced the database, and would do more -- to dispel the above notion, and therefore to attract more qualified recruits.

Financial services companies, and technology suppliers around them, came up with much of the 'modern day' tech, sometimes decade(s) before the Googles and Facebooks of the world:

  -- NoSQL with stored procs (e.g. Goldman S, early to mid 90s)
  -- in-house grown programming lang (APL+ based, at Morgan Stanley) emphasizing vector-based operations
  -- Smart contracts, where a contract is represented as an algorithm specified in a domain-specific language:
     https://www.lexifi.com/product/technology/contract-description-language
     This was way before Ethereum; it started, I believe, at Credit Suisse (but not sure)
  -- one of the fastest time series databases (kdb+)
  -- Data science and machine learning (modeling risk and valuations)
Today's hedge funds manage petabytes of data. So do many of the big investment banks...

jroblak9y ago

https://bloomberg.github.io/comdb2/overview_home.html

>> We had several goals in mind. One was being wire-format compatible with an older internal product to allow developers to migrate applications easier. Another was to have a database that could be updated anywhere and stay in sync everywhere. The first goal could only be satisfied in-house. The second was either in its infancy for open source products, or available at high cost from commercial sources.

foobarbazetc9y ago

I've worked at midsize "non-engineering" companies that do boring stuff like logistics where they wrote their own RDBMS and even virtual machines for legacy hardware.

It's not that uncommon especially if you've been around a long time.

addicted9y ago

Bloomberg was doing cloud computing decades before the term was even invented.

Edit: As someone else points out, they were doing cloud computing before the internet existed.

keeptrying9y ago

People forget that Bloomberg had computer networks before the Internet.

I remember when we didn't even have comdb2.

joshu9y ago

Bloomberg is a massive tech company.

zerr9y ago

If even advertising companies are engineering shops, what makes you think fintech companies aren't?

gaius9y ago

Right, it's like everyone's forgot what Google's business model actually is!

shusson9y ago

The work started 12 years ago when there were probably no `NewSQL` [1] solutions out there.

[1] https://en.wikipedia.org/wiki/NewSQL

zeusk9y ago

Bloomberg, after all, is a high-tech news and information delivery network.

nemothekid9y ago

>Was there nothing off the shelf that would have fit their needs?

I can't think of any solid distributed RDBMS that would have been around in 2004. Does anyone with more knowledge have any idea (open or not)?

Keyframe9y ago

IBM DB2 and Oracle 9i were around for sure. Other solutions as well. DB landscape was well-established back then.

gaius9y ago

DEC RDB might have done it, if it was still available.

Scaevolus9y ago· 4 in thread

Reading through their VLDB paper [1], Comdb2 appears to be a moderately scalable (up to dozens of nodes (?)) RDBMS with a strong emphasis on consistency and availability. Benchmarks show numbers comparable to Percona XtraDB (MySQL with a different storage engine), at ~2,000 writes/s and 2,000,000 reads/s against a 6-node server cluster. High availability and global sequencing are provided by using GPS clocks, similar to Spanner/TrueTime.

Schema changes happen lazily, with old rows being rewritten on the next update, and a background job doing bulk rewriting.

Scalability: "While reads scale in a nearly linear manner as cluster size increases, writes have a more conservative level of scaling with an absolute limit. The offloading of portions of work (essentially all the WHERE clause predicate evaluations) across the cluster does help scale writes beyond what a single machine can do. Ultimately, this architecture does saturate on a single machine’s ability to process the low level bplogs."

This doesn't provide the horizontal scaling that Spanner does, CockroachDB aims at, or FoundationDB presumably has.

[1]: http://www.vldb.org/pvldb/vol9/p1377-scotti.pdf
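A minimal sketch of the lazy schema change idea mentioned above (old rows rewritten on their next update, plus a background job for the rest). This is an illustration only, not Comdb2's implementation: the `Table`, `Row`, `alter`, and `background_rewrite` names are hypothetical, and a real engine would do this at the storage-page level.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    version: int  # schema version the row was last written under
    data: dict


@dataclass
class Table:
    schema_version: int = 1
    # Hypothetical upgrade functions, keyed by the version they upgrade *from*.
    upgrades: dict = field(default_factory=dict)
    rows: dict = field(default_factory=dict)

    def alter(self, upgrade_fn):
        """Register a schema change without touching any stored row."""
        self.upgrades[self.schema_version] = upgrade_fn
        self.schema_version += 1

    def _upgraded(self, row):
        """Return the row's data as seen under the current schema."""
        data, version = dict(row.data), row.version
        while version < self.schema_version:
            data = self.upgrades[version](data)
            version += 1
        return data

    def read(self, key):
        # Reads convert on the fly; the stored row keeps its old version.
        return self._upgraded(self.rows[key])

    def update(self, key, **changes):
        # An update is the moment an old row gets rewritten in the new format.
        data = self._upgraded(self.rows[key])
        data.update(changes)
        self.rows[key] = Row(self.schema_version, data)

    def background_rewrite(self, batch=100):
        """Rewrite up to `batch` stale rows; return how many were rewritten."""
        stale = [k for k, r in self.rows.items()
                 if r.version < self.schema_version][:batch]
        for k in stale:
            self.rows[k] = Row(self.schema_version, self._upgraded(self.rows[k]))
        return len(stale)


t = Table()
t.rows["a"] = Row(1, {"name": "x"})
t.rows["b"] = Row(1, {"name": "y"})
t.alter(lambda d: {**d, "active": True})  # "add column", instant
t.update("a", name="z")                   # row "a" is rewritten now
```

After the `alter`, row "b" still sits on disk in the old format until either an update touches it or `background_rewrite` gets to it; reads see the new shape in the meantime.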

idibidiartists9y ago

<< or FoundationDB presumably has >>

You mean FaunaDB?

elvinyung9y ago

FoundationDB was the proto-CockroachDB (or rather, CockroachDB is essentially a re-attempt at building FoundationDB). It was an early attempt at building a NewSQL database. (NewSQL per se, i.e. not counting parallel databases from the pre-NoSQL age, like Gamma [1], Volcano [2] and Grace [3], which share many of the same design principles.)

FoundationDB was acquhired by Apple, but its failure is generally attributed to a poorly-performing SQL layer: https://www.voltdb.com/blog/2015/04/01/foundationdbs-lesson-...

[1] http://pages.cs.wisc.edu/~dewitt/includes/paralleldb/ieee90....

[2] https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b...

[3] https://pdfs.semanticscholar.org/a7f4/e4e6166dc683e7fa7d5b9e...

Scaevolus9y ago

No, FoundationDB, which was posting some VERY impressive numbers before being acquired by Apple in March 2015.

Here they were doing 15M writes/s on 32 16-core servers, at a rate of 30,000 writes/s/core: http://web.archive.org/web/20150427041746/http://blog.founda...

FaunaDB managed 120,000 writes per second on 15 machines. https://fauna.com/blog/distributed-acid-transaction-performa...

(Yes, not equivalent benchmarks, but that's still a 50x difference in magnitude.)
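For reference, the arithmetic behind those figures can be checked in a few lines. The inputs are only the numbers quoted in the comment above; the variable names are made up for the example, and the two benchmarks remain non-equivalent as already noted.

```python
# Figures quoted in the comment above.
fdb_writes, fdb_servers, fdb_cores_per_server = 15_000_000, 32, 16
fauna_writes, fauna_machines = 120_000, 15

fdb_per_core = fdb_writes / (fdb_servers * fdb_cores_per_server)
fdb_per_machine = fdb_writes / fdb_servers
fauna_per_machine = fauna_writes / fauna_machines
per_machine_ratio = fdb_per_machine / fauna_per_machine
total_ratio = fdb_writes / fauna_writes

print(round(fdb_per_core))          # 29297
print(round(fdb_per_machine))       # 468750
print(round(fauna_per_machine))     # 8000
print(round(per_machine_ratio, 1))  # 58.6
print(round(total_ratio))           # 125
```

So the "30,000 writes/s/core" figure is a rounding of about 29,300, and the "50x" is roughly the per-machine ratio; the ratio of total throughput is 125x.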

idibidiartists9y ago

Down voted for asking a question? The system is definitely broke in such instances.

ihenriksen9y ago· 4 in thread

I see Comdb2 requires SQLite to install. So, I'm guessing Comdb2 is a distributed storage engine for SQLite, or?

mponomar9y ago

There's a heavily modified SQLite embedded in Comdb2 for query parsing/planning. There's some auxiliary tools that can optionally use SQLite (or Comdb2). It'll run without installing an SQLite package.

kanwisher9y ago

I believe originally it wasn't SQL, and they added SQL by using the SQLite parsing engine. They are a massive contributor to the SQLite project. It is in no way supposed to be compatible with SQLite.

czinck9y ago

I think you're mixing 2 things. This is a complete rewrite of an old key-value store (hence the 2 in the name) but comdb2 was always SQL. Comdb2 shares a few things with comdb, but they're really just to make migration easier (the preference for tags in csc2 files instead of usual DDL, and then a tag based API that it looks like we got rid of for this release). Under the covers comdb2 is completely different and as far as I know shares no code with comdb.

ihenriksen9y ago

So, you can run Comdb without SQLite?

macdice9y ago· 3 in thread

Some assorted interesting points:

* optimistic concurrency control (sometimes you need to retry, but often the optimism pays off)

* serializable transaction isolation (something like but not exactly like ssi, rather than 2pl)

* ieee 754-2008 decimal floats

* undo-based mvcc (writers don't block readers)

* group (network) sync replication

* paxos based failover

* lua stored procs

Interesting technology and I'm very happy to see it open-sourced. Kudos to the team. (I used this when I worked there. Few firms can pull off something like this in-house; they could. You wouldn't believe how much data they store in this thing.)
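To illustrate the first bullet (optimistic concurrency control with retries), here is a toy sketch. It is not Comdb2's protocol: `VersionedStore`, `Conflict`, and `run_transaction` are hypothetical names, and the store is a single in-process dictionary with a per-key version counter. The point is only the shape of the idea: do the work without holding locks, validate at commit, and retry if someone else got there first.

```python
import threading


class Conflict(Exception):
    """Raised when a commit finds the data changed since it was read."""


class VersionedStore:
    """Toy single-process store; each key carries a version counter."""

    def __init__(self):
        self._data = {}  # key -> (version, value)
        self._lock = threading.Lock()

    def read(self, key):
        return self._data.get(key, (0, None))

    def commit(self, key, expected_version, value):
        # Validation happens only here; nothing is locked while the
        # transaction body itself runs.
        with self._lock:
            current_version, _ = self._data.get(key, (0, None))
            if current_version != expected_version:
                raise Conflict(key)
            self._data[key] = (current_version + 1, value)


def run_transaction(store, key, fn, max_retries=10):
    """Read, compute, try to commit; on conflict, start over.

    Returns the number of retries that were needed.
    """
    for attempt in range(max_retries):
        version, value = store.read(key)
        try:
            store.commit(key, version, fn(value))
            return attempt
        except Conflict:
            continue
    raise Conflict(f"gave up on {key!r} after {max_retries} attempts")


store = VersionedStore()
retries = run_transaction(store, "counter", lambda v: (v or 0) + 1)
```

When conflicts are rare, the common path never waits on a lock, which is the "optimism pays off" case; under heavy contention on the same key the retry loop becomes the cost.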

microcolonel9y ago

You know, I never really understood the point of decimal floats. Why use floats at all if the point is to express bips (basis points) and dollar ppm? You might as well just pick a fraction size and use uint64_t to my mind, or just some bigint type (which is still going to be faster than decimal floats).

macdice9y ago

The point is that it's hard to pick the fraction size. US national debt vs Italian lira/Swiss franc exchange rate: humans generate numbers of wide-ranging scales, and yet it's convenient for computers to deal in fixed-size datatypes. Hence floating scale. But yeah, it's a compromise. I expect these new types to catch on and be added to various language standards over the coming years, but we'll see.

DenisM9y ago

Small parts, like screws, can be priced in microdollars for accounting purposes. At the same time the US GDP is on the order of $20 trillion. 64 bit is not large enough to fit both of these types of numbers.

Makes using SQLITE a real pain...
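The range argument above is easy to check. The sketch below assumes a fixed scale of one microdollar per integer unit and takes "$20 trillion" literally; Python's `decimal` module is used only as a stand-in for a decimal floating-point type (it is not the IEEE 754-2008 decimal64 encoding itself).

```python
from decimal import Decimal

MICRO = 10**6              # microdollars per dollar (assumed fixed scale)
gdp_dollars = 20 * 10**12  # "on the order of $20 trillion", per the comment

gdp_micro = gdp_dollars * MICRO  # 2 * 10**19 microdollars
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1

print(gdp_micro > INT64_MAX)   # True
print(gdp_micro > UINT64_MAX)  # True

# A decimal float stores digits plus an exponent, so a tiny unit price
# and a huge aggregate both fit in the same type.
screw = Decimal("0.000003")
gdp = Decimal("2E+13")
print(screw.as_tuple().exponent, gdp.as_tuple().exponent)  # -6 13
```

With a microdollar scale, $20 trillion overflows even an unsigned 64-bit integer, so a fixed-point scheme has to either pick a coarser scale or move to a wider type; a floating decimal scale trades that for limited significant digits.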

nodesocket9y ago· 3 in thread

Somewhat annoying you have to copy data to nodes manually:

    copycomdb2 mptest-1.comdb2.example.com:${HOME}/db/testdb.lrl

This sort of stuff is why I loved RethinkDB. They handled all these complexity details for you.

ketralnis9y ago

I'm not too offended by letting the administrator figure out the best way to do an uncommon operation. As a point of comparison, Cassandra does have a proper way to bootstrap new nodes but I've found that in many cases it's better to short circuit it and rsync the initial data myself (and use its repair functionality to clean up the mess).

Some reasons include throttling load on the "old" servers, better feedback on progress, the ability to pause/resume, or even being able to do it faster than the DBMS can e.g. by snapshotting the disk on the source machine and making a CoW clone of it. Heck, if you're running your own hardware and feeling a little reckless, pull out one of the drives from the source machine's RAID mirror and you've already got a full clone right there.

I guess you could build all of that into the DBMS, but it's a rather specialised manual operation that's not happening all that often and it's one of the cases that the administrator almost certainly does know better

coredog649y ago

How does that work? As near as I can figure, you need to have all the sstable files from all nodes in a rack on disk. Most will be discarded on "nodetool cleanup", but I would expect it to have to rewrite all the files due to the new token range.

dankohn19y ago

RethinkDB is still around (present tense). It has been re-licensed under Apache-2.0 and a community is building to move it forward.

https://www.joyent.com/blog/the-liberation-of-rethinkdb

gleenn9y ago· 2 in thread

Nice they have a JDBC driver... makes it a lot easier to hook into. After looking at all the C++ jobs on Bloomberg's website, makes sense the schema format looks C++ish.

pjmlp9y ago

They have a few people on the ANSI C++ process, including Bjarne. :)

maxlybbert9y ago

I think Bjarne works for Morgan Stanley. Bloomberg has John Lakos, Alisdair Meredith, Dietmar Kühl and others (although I'm not sure which of those are officially on the committee).

boxfish9y ago· 1 in thread

What's the motivation for Bloomberg to open-source this?

rusanu9y ago

Probably the same motivation Yahoo had to release Hadoop, FB to release Hive, Netflix to release so much of their libs and so on and so forth:

- if nothing else, it does no harm (no 'secret sauce' competitors could benefit from)

- it buys karma (think recruiting goodwill)

If the project catches on though then there are many advantages:

- it can spark a self-sustained ecosystem that can further drive the product, at much lower cost for original creator (think Hadoop leading to Cloudera, Hortonworks etc). Product improves, bugs are fixed, toolset matures

- new hires come with the know-how to use your internal tools: lower ramp-up, better productivity. Anecdotal, but when I was at Microsoft no new hire knew how to use the internal Cosmos stuff, and even among old-timers more folk were familiar with Hadoop...

shusson9y ago

I wonder how this compares to CockroachDB

brian_herman9y ago

Neat! It supports WSL! Even better! O_o
