“The benchmark numbers are completely wrong for both databases”

(github.com)

165 points · neumino · 11y ago · 57 comments

weddpros · 11y ago · 14 in thread

Well, the author cares enough about RethinkDB to test it, even if he's a mongodb fan. Even if his first benchmark was wrong, he was right to publish it: you all helped him when you pinpointed the problems in his tests... Thank you for that.

I don't see any marketing here, just the "do your own benchmark" best practice, and the "share with community" best practice... Does it make it a perfect benchmark? No, but at least he tried... and the author has corrected the discrepancies since then.

Now imagine the benchmark was against [your favorite DB here] with even stronger results against RethinkDB. Notice how the most upvoted comment is joking about MongoDB. The second one is a pro-mysql comment. What's the point? Would it have been a better benchmark if it read "mysql is 10x faster than RethinkDB?" or "MongoDB is even slower than RethinkDB"?

timmaxw · 11y ago

> the author has corrected the discrepancies since then

As of the time I posted this comment, the blog post still seems to be comparing indexed MongoDB operations against non-indexed RethinkDB operations. Under those conditions I'd expect RethinkDB to be at least 1000x slower than MongoDB. The fact that he's finding that RethinkDB is only 3x slower than MongoDB makes me think that there are still other major problems with this benchmark.

> No, but at least he tried...

It's true that the author tried; but that doesn't change the fact that people are going to read this blog post and assume that the numbers are at least approximately correct. As a RethinkDB employee, it really frustrates me to see RethinkDB being judged according to benchmarks that are conducted so carelessly that they are essentially random.

I think this is the fourth time in the past year that I've seen a third party try to benchmark RethinkDB and get something wrong. Maybe we need to start a "best practice" of checking in with the maintainers of a project before publishing benchmark results about the project.

weddpros · 11y ago

Some mistakes in his benchmarks, among probably others:

- I don't see any mongodb index creation, so mongodb is inserting with no index while rethinkdb is inserting with the index. That's probably why there's a gap between the two

- there's no mongodb index, and rethinkdb queries do not make use of the index (this is probably why rethinkdb is not 1000x slower: both aren't using indexes)

- the $in query should be last_update: random_timestamp(), there's no need for $in here

- his insertion code creates 100K memory clones of the object to insert in the mongodb version only, not in rethinkdb

I'm sad to add: what the author is benchmarking here is the likely performance of a system he could build with either db. It's not necessarily bad (save for bad press) that he's bad at benchmarking: the mistakes he's made in his benchmark are similar to the mistakes he'll make in his code.

But yes, the author may use some help!
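One way to catch the index mismatch described above before timing anything is to ask the database how it plans to execute the query. The sketch below uses SQLite from Python's standard library purely as a stand-in: it is neither MongoDB nor RethinkDB, and the table, column, and index names are made up for illustration. MongoDB offers `explain()` for the same purpose.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable plan SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the description text.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, last_update INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO docs (last_update, payload) VALUES (?, ?)",
    ((i % 1000, "x") for i in range(10_000)),
)

lookup = "SELECT * FROM docs WHERE last_update = ?"

# Same query, inspected before and after the index exists.
plan_without_index = query_plan(conn, lookup, (42,))
conn.execute("CREATE INDEX idx_last_update ON docs (last_update)")
plan_with_index = query_plan(conn, lookup, (42,))

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

If the "after" plan does not mention the index, the timed query is still a full scan, and comparing it with an indexed query on the other database says very little.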

arielweisberg · 11y ago

RethinkDB needs to ship its own benchmark client. Also implement an ugh YCSB driver. Provide both with the database download.

It's madness to expect someone new to database benchmarking to implement a correct fully featured benchmark client. They are going to stumble enough on database and instance configuration as it is.

geocar · 11y ago

> Maybe we need to start a "best practice" of checking in with the maintainers of a project before publishing benchmark results about the project.

http://en.wikipedia.org/wiki/David_DeWitt

fche · 11y ago

@threeseed

> [...] I've never used a database that required me to explicitly define which indexes I want to use for a read. [...]

In a way, it's traditional (IBM IMS/DB, 1960s).

erbdex · 11y ago

+1 on 'Profiling best practice'. Any such existing project?

threeseed · 11y ago

Not sure why you are frustrated: it's just a blog post by someone who was inexperienced with your product. At least he owned up to the mistakes and was willing to fix them. It's an opportunity for you to work with the guy to show him how to do it properly and write a blog post of your own.

I would say that you probably should look at your API because I've never used a database that required me to explicitly define which indexes I want to use for a read. But I've never used RethinkDB so maybe there is a legitimate reason.

copsarebastards · 11y ago

So basically you're saying that because he cares and because he tried, and because the comments were dumb, nobody should criticize him? Do you really not care about getting accurate results?

Running benchmarks is an engineering practice. If you failed to get meaningful results, you failed. Yes, he cares, yes, he tried, yes, the comments are dumb, but he still failed. Sure, I'll give the guy kudos for trying, but I'm not going to pretend he didn't fail. As far as I'm concerned, telling someone they failed is a favor, because now they can change their methodology, try again, and maybe succeed. It's part of the process of achieving meaningful results. The entire point of what he's doing is to achieve meaningful results, not to get a participation medal.

Your response reminds me of this: https://www.youtube.com/watch?v=gSjLiQxEZlM

steego · 11y ago

> So basically you're saying that because he cares and because he tried, and because the comments were dumb, nobody should criticize him?

I don't think anybody is saying that.

JDDunn9 · 11y ago

You must be fun at parties...

sklivvz1971 · 11y ago

I suppose the author of the test forgot the "only test realistic scenarios" best practice.

weddpros · 11y ago

We know nothing about the author's use case...

Maybe he wanted to know where each DB shines compared to each other, to see if some workloads are better suited to one or the other.

Of course, benchmarks "should" include concurrent reads/updates/writes/deletes because it can make a huge difference depending on the DB's implementation.

Of course, the author "should" also have tested sharding / durability / resistance to partition / resource consumption in his tests... Maybe he didn't have the resources to test properly. I also do quick&dirty benchmarks like these, mostly because exhaustive benchmarks cost so much more (time, money, expertise)...
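As a rough illustration of the concurrent mixed workload mentioned above, here is a minimal sketch. `InMemoryStore` is a hypothetical stand-in for a database client, not a real driver, and the 70/25/5 read/write/delete mix is an arbitrary assumption; a real test would swap in actual driver calls and a mix taken from the application.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor

class InMemoryStore:
    """Hypothetical stand-in for a database client; not a real driver."""
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def write(self, key, value):
        with self._lock:
            self._data[key] = value

    def read(self, key):
        with self._lock:
            return self._data.get(key)

    def delete(self, key):
        with self._lock:
            self._data.pop(key, None)

def worker(store, seed, n_ops):
    """Run a random mix of operations and count how many of each ran."""
    rng = random.Random(seed)  # per-worker seed keeps the mix reproducible
    counts = {"read": 0, "write": 0, "delete": 0}
    for _ in range(n_ops):
        op = rng.choices(["read", "write", "delete"], weights=[70, 25, 5])[0]
        key = rng.randrange(1000)
        if op == "read":
            store.read(key)
        elif op == "write":
            store.write(key, seed)
        else:
            store.delete(key)
        counts[op] += 1
    return counts

def run_mixed_workload(n_workers=4, n_ops=1000):
    """Run the workers concurrently and sum their per-operation counts."""
    store = InMemoryStore()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda s: worker(store, s, n_ops), range(n_workers)))
    totals = {"read": 0, "write": 0, "delete": 0}
    for counts in results:
        for op, n in counts.items():
            totals[op] += n
    return totals

totals = run_mixed_workload()
print(totals)
```

The point is only the shape: several clients issuing different operation types at the same time, rather than one client running inserts and then reads in sequence.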

DannoHung · 11y ago

> "do your own benchmark" best practice

Is this a best practice? It seems like we've been delivered evidence that it is really hard to do good benchmarks unless you're already intimately familiar with what you're testing, which says something about how hard it is to make a good choice.

I don't know about other industries, but this sort of result is what stuff like the STAC M3 Benchmark suite was designed for: Typical usecases that experts can implement so you can get realistic performance comparisons.

sarciszewski · 11y ago

I have to say, it's rare for me to agree with a HN comment as much as I do to this one. Well said!

jakozaur · 11y ago · 5 in thread

Benchmarking is hard and a lot of reports are bogus. However they are still very useful for a lot of developers.

Benchmarking programming languages got better. E.g.: http://benchmarksgame.alioth.debian.org/ gives a rough idea of the performance of programming languages.

I wish something similar existed for databases. I think exact figures would be hard to get, but I believe there are many 2x 10x differences that we should be aware of.

boomlinde · 11y ago

> Benchmarking is hard and a lot of reports are bogus. However they are still very useful for a lot of developers.

In this case we were presented with a benchmark setup that failed to perform the task it supposedly benchmarked. That's not hard to avoid, and it makes the benchmark completely useless and misleading.

I don't think that this problem can be generalized in the sense you do here, since the problem with benchmarks usually isn't a complete failure to perform the task to be benchmarked, but things like finding a set of tests that give a fair representation of what you'd typically use the subjects for, or performing the tasks in idiomatic and optimal ways.

amaranth · 11y ago

Don't a lot of those benchmarks end up only measuring how fast your language can call out to GMP to do the real work? And regex-dna ends up measuring your regex implementation which for a lot of them is again just going to be measuring how fast they can all call out to PCRE.

They're neat and all and it is called the benchmarks game but I wish they'd remove the ones that end up getting gamed like that.

igouy · 11y ago

>> Don't a lot of those benchmarks… <<

No.

>> …how fast they can all call out to PCRE. <<

No. Have you looked at the programs?

collyw · 11y ago

In my experience, the database is the best place to perform optimizations, not in the code. I guess a lot depends on what you are doing.

amelius · 11y ago

> Benchmarking programming languages got better.

I wish we got this kind of benchmark for the altjs solutions.

sklivvz1971 · 11y ago · 3 in thread

This old comic piece seems to still apply: http://www.mongodb-is-web-scale.com/

copsarebastards · 11y ago

https://twitter.com/mongodbfacts

weddpros · 11y ago

Then how on earth is it possible to build Parse.com with MongoDB?

http://blog.parse.com/announcements/mongodb-rocksdb-parse/

sklivvz1971 · 11y ago

Maybe you didn't get the joke. The joke is not about MongoDB, but about MongoDB fanbois that care only about some very narrow definition of "performance".

The wider message is that DBs are way more complex beasts than is meaningful to test this way.

Obviously there are cases in which MongoDB is a great choice, but equally obviously tests like this should not be a reason for the choice.

weavie · 11y ago · 3 in thread

> I expect to be able to click on a table and see the rows inside - like all the tools out there for Mongo, MySQL, PostgreSQL, etc.

He does have a point there. It is slightly annoying having to type out a query when I just want to browse the data. (No real biggie though..)

neumino (OP) · 11y ago

There are other tools for this, like Chateau: https://github.com/neumino/chateau

CHY872 · 11y ago

It's not a good point, though. Writing a rudimentary tool of that type would take perhaps an hour.

weavie · 11y ago

Indeed, and typing out the query really doesn't take that long.

It does affect the initial experience though. The admin panel looks so slick I just assumed I would be able to click on the table and it would jump to the data. When it didn't I was surprised and initially blamed myself and fired up the console to see if there were any errors showing. It just affects the polish of the panel.

As I said though it is a minor point for an excellent product.

tinussky · 11y ago · 3 in thread

benchmark = marketing (more times than not)

cpeterso · 11y ago

Hence "benchmarketing".

threeseed · 11y ago

Everyone likes to see how their tool of choice performs against other tools.

And it's an important (albeit just one part) of deciding which product to use.

tinussky · 11y ago

I always say do your own benchmarking for your own use case.

The risk of a colored benchmark is quite high when the benchmark is done by the owner of a product or by a "fan" of the product. The exception is a well-explained, clear benchmark that everyone can understand and reproduce easily.
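For what "reproduce easily" can look like in practice, here is a minimal timing harness sketch using only the standard library. The `workload` function is a placeholder assumption (it just sorts some dicts); the seed, document shape, and repeat counts are arbitrary and would be replaced by the real operation under test.

```python
import random
import statistics
import time

def benchmark(fn, *, warmup=2, repeats=5):
    """Time fn() several times and return the per-run durations in seconds."""
    for _ in range(warmup):
        fn()  # warm caches before measuring
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations

# Build the test data once, outside the timed region, from a fixed seed
# so that anyone rerunning the script gets the same inputs.
rng = random.Random(12345)
documents = [{"id": i, "last_update": rng.randrange(1_000_000)} for i in range(10_000)]

def workload():
    # Placeholder workload: replace with the real insert or query under test.
    return sorted(documents, key=lambda d: d["last_update"])

durations = benchmark(workload)
print(
    f"median {statistics.median(durations):.6f}s, "
    f"min {min(durations):.6f}s, max {max(durations):.6f}s "
    f"over {len(durations)} runs"
)
```

Keeping data generation outside the timed function also avoids the situation described earlier in the thread, where only one side of the comparison paid for copying 100K objects.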

lukev · 11y ago · 2 in thread

This is why a lot of commercial databases have a "you may not publish benchmarks" clause in the license.

It seems unfair and restrictive, but benchmarking is hard, and even where users get everything else right, every data load is different.

It's easy to see why companies don't want writeups like this one dominating searches for "<database> performance".

userbinator · 11y ago

It seems even more unfair and restrictive to do that.

If you prohibit benchmarks, it's essentially saying "we have something to hide" (like bad performance...), or "we don't want competition".

I think it's far better to point out the flaws and specifics than to censor any attempt at comparison.

mejari · 11y ago

> it's essentially saying "we have something to hide" (like bad performance...), or "we don't want competition"

This seems a totally imaginary dichotomy when literally the comment you're replying to presents an alternative option, namely: "We don't want you to publish things that are almost always going to be wrong and misleading"

resonation · 11y ago · 1 in thread

...forgot to call run().
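For readers unfamiliar with why a missing `run()` matters: in RethinkDB's drivers a query is only built up as an object until `run()` is called with a connection, so timing the construction alone measures almost nothing. The toy class below is a hypothetical illustration of that lazy pattern, not the real driver API.

```python
class LazyQuery:
    """Hypothetical lazy query builder; it mimics the pattern, not a real driver API."""
    def __init__(self, rows, steps=()):
        self._rows = rows
        self._steps = tuple(steps)

    def filter(self, predicate):
        # Only records the step; no rows are touched here.
        return LazyQuery(self._rows, self._steps + (predicate,))

    def run(self):
        # The actual work happens only when run() is called.
        result = self._rows
        for predicate in self._steps:
            result = [row for row in result if predicate(row)]
        return result

rows = [{"id": i, "last_update": i % 10} for i in range(100)]

# Building the query returns another query object; nothing is executed.
built_only = LazyQuery(rows).filter(lambda r: r["last_update"] == 3)

# Calling run() is what actually produces rows.
executed = LazyQuery(rows).filter(lambda r: r["last_update"] == 3).run()

print(type(built_only).__name__, len(executed))
```

A benchmark loop that stops at the `built_only` step will look extremely fast, because it never reads any data.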

threeseed · 11y ago

It was rerun with that call: https://www.amon.cx/blog/rethinkdb-reviewed-by-a-mongo-fan/

Still we don't know the hardware, what versions were installed, configuration changes etc.

The whole thing is pretty meaningless.
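Recording the missing context is cheap. The sketch below gathers basic machine and runtime details with the standard library; the `db_versions` key and its "unknown" values are placeholders, since database versions and configuration changes would have to be filled in by hand or queried from each server.

```python
import json
import os
import platform
import sys
from datetime import datetime, timezone

def describe_environment(extra=None):
    """Collect basic facts about the machine and runtime running the benchmark."""
    info = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "machine": platform.machine(),
        "cpu_count": os.cpu_count(),
    }
    # Database versions and configuration changes have to be supplied by hand
    # (or queried from each server); the keys used below are only examples.
    info.update(extra or {})
    return info

env = describe_environment(
    {"db_versions": {"mongodb": "unknown", "rethinkdb": "unknown"}}
)
print(json.dumps(env, indent=2))
```

Publishing this output next to the numbers lets readers judge whether the results could apply to their own setup.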

JDDunn9 · 11y ago · 1 in thread

Would be nice to see some official benchmarks from RethinkDB. As this illustrates, there aren't any good ones out there.

insaneirish · 11y ago

> Would be nice to see some official benchmarks from RethinkDB.

Of some artificial scenario that people will claim does not reflect [their] reality?

> As this illustrates, there aren't any good ones out there.

The only good benchmarks are the ones that you define to be representative of something you'll actually see in reality.

webtards · 11y ago · 1 in thread

Not sure what's worse here - people relying on third party benchmarks (hint: always do your own; see how a tool performs on your data, on your hardware, for your problem set), or the fanboy-ish panic when they are unsettled that a benchmark might make their chosen toy less shiny?

mikecmpbll · 11y ago

The former :)

kalleboo · 11y ago

From the linked benchmark:

> In RethinkDB, you have to create databases and tables manually and it will raise an exception if they already exist. Compared to MongoDB that could be an inconvenience for some (and me) - one of the things I find appealing in MongoDB is the fluid interaction with databases

... well at least now I don't feel so bad about having some old MySQL stuff still in production. MySQL already has too much "fluidity" in dealing with my data...

dantiberian · 11y ago

I've been working with RethinkDB recently on some slightly unusual things and the Rethink team has been first class. They've got great support on IRC and GitHub and are open and friendly. I highly recommend them.

mikecmpbll · 11y ago

Any output like this, unless maliciously fallacious, is contributing in some way to the general understanding of the software concerned and benchmarking best-practices, even through its mistakes.

It's the job of the reader to judge their sources wisely, and interrogate what they read, rather than the job of the author to conduct their explorations in private.

Understandably, it can be frustrating for people involved in the projects but that's just the nature of the beast. They can do things to help their cause by championing good examples of benchmarking, even those which don't look upon them favourably.
