What MongoDB got right

(blog.nelhage.com)

144 points · reqres · 10y ago · 110 comments

65 comments · 12 top-level

s_kilk · 10y ago · 20 in thread

> Let's start with the simplest one. Making the developer interface to the database a structured format instead of a textual query language was a clear win.

I think this is the most significant factor, by far. With Mongo it's turtles (or at least Maps/Hashes) all the way down, without a strange pseudo-english layer near the bottom that forces you to translate back and forth. For some devs that's a big deal.

For the last while I've been experimenting with bringing the same feature to PostgreSQL (http://bedquiltdb.github.io). It turns out to be very doable, but I don't have enough time to make it as featureful as it needs to be.

dbattaglia · 10y ago

Maybe I'm weird and just like SQL but the query-as-data aspect of Mongo is actually my least favorite aspect of using it. Ad-hoc queries become torture, as do complicated queries. I might just be lucky to have used SQL Server and C# for so long which eases a lot of the pains you described.

lloyd-christmas · 10y ago

It's pretty easy to build wrappers for things like that, though. Build a constructor that has instance methods for things like `myQueryInstance.addEqualsQuery(key, value)`. You can then just pass it around and add to the query instance as it moves around, deconstructing it at your database access point.
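
A minimal Python sketch of the wrapper idea above. The `QueryBuilder` class and its `add_equals` method are hypothetical names standing in for `addEqualsQuery`; the only convention relied on is that a MongoDB equality filter is a plain mapping of field to value.

```python
class QueryBuilder:
    """Hypothetical wrapper that accumulates equality conditions."""

    def __init__(self):
        self._conditions = {}

    def add_equals(self, key, value):
        # Record one "field == value" condition; return self so calls can chain.
        self._conditions[key] = value
        return self

    def build(self):
        # "Deconstruct" into a plain dict at the database access point.
        return dict(self._conditions)


query = QueryBuilder()
query.add_equals("status", "active")   # added in one part of the app
query.add_equals("country", "IE")      # added somewhere else, later
mongo_filter = query.build()
print(mongo_filter)
```

The resulting dict is the kind of value a driver's find call would take as its filter; nothing in the sketch talks to a database.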

meowface · 10y ago

Same. I wish there was a DSL layer on top.

1 more reply

collyw · 10y ago

SQL is still one of the most readable languages in my opinion. It's the one language where I find it easier to read queries than write them.

jerf · 10y ago

SQL's fine on its own. The problem arises when you try to manipulate it. A query to get messages that were sent between now and a week ago, joining in the user table for both the sender and the receiver to get their full names, is sensible enough when written out as SQL. I can quibble about some of the affordances of SQL, but it's good enough for most queries. The problem is if you try to create some sort of data that represents "get the message body", "messages sent since last week", "get the sender's name", "get the receiver's name", and somehow programmatically assemble a query out of the pieces. (Slight quibble, this isn't actually about OO, this is about "anything that isn't SQL". A Haskell library will have the exact same problem.) An AST-based approach doesn't guarantee that this will be easy, but it's still a step above trying to assemble an SQL query from those pieces.

It's totally possible. There's a ton of libraries that do it in various ways. It's just not even remotely easy, like it should be. If you look into the inside of those libraries you'll find a Cthulhian monstrosity of special cases interacting with special cases until the whole thing just explodes into a brain-consuming mess, because SQL was very, very clearly not written for this use case.
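
To make the "assemble a query out of the pieces" idea concrete, here is a toy Python sketch. Everything in it is an assumption made for illustration: the `Select` structure, the helper functions, and the `messages`/`users` schema are all invented. It only handles inner joins and AND-ed filters, which is exactly the easy case; the special cases the comment describes start where this sketch stops.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Select:
    """Hypothetical query-as-data structure: lists instead of one SQL string."""
    table: str
    columns: list = field(default_factory=list)
    joins: list = field(default_factory=list)
    wheres: list = field(default_factory=list)
    params: list = field(default_factory=list)

    def render(self):
        # Only here, at the very end, do the pieces become SQL text.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        for join in self.joins:
            sql += f" JOIN {join}"
        if self.wheres:
            sql += " WHERE " + " AND ".join(self.wheres)
        return sql, self.params


# Each piece is an independent function that adds to the structure.
def message_body(q):
    q.columns.append("m.body")

def sent_since(q, cutoff):
    q.wheres.append("m.sent_at >= ?")
    q.params.append(cutoff)

def sender_name(q):
    q.columns.append("s.full_name AS sender")
    q.joins.append("users AS s ON s.id = m.sender_id")

def receiver_name(q):
    q.columns.append("r.full_name AS receiver")
    q.joins.append("users AS r ON r.id = m.receiver_id")


q = Select(table="messages AS m")
message_body(q)
sent_since(q, "2016-01-08")
sender_name(q)
receiver_name(q)
sql, params = q.render()

# Run the assembled query against a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT);
    CREATE TABLE messages (id INTEGER PRIMARY KEY, sender_id INTEGER,
                           receiver_id INTEGER, body TEXT, sent_at TEXT);
    INSERT INTO users VALUES (1, 'Ada Lovelace'), (2, 'Alan Turing');
    INSERT INTO messages VALUES (1, 1, 2, 'old', '2016-01-01'),
                                (2, 2, 1, 'new', '2016-01-10');
""")
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```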

In another 10 or 20 years I look forward to data-based analysis that tries to determine how much of the "NoSQL" movement was because the relational data model doesn't work for all use cases, and how much of it was the entirely accidental (in the Brooks sense) problems with A: SQL, the language itself, not its capabilities and B: schema migrations with no essential reason to be as painful as they are. (And something something column stores, but I'm not sure where they fit into this story exactly.) And to be clear on tone, I really am interested. I'm pretty sure the answer won't be either extreme but I'm pretty uncertain about where in the middle we'll fall.

2 more replies

s_kilk · 10y ago

I should clarify, I'm a big fan of SQL and relational DBs, but it's hard to deny that some/many devs find the abstraction boundary a bit weird.

And really, when you think about it, it is a bit weird, (we send sentences of almost-english toward the DB, then get structured data back) and there's nothing wrong with that.

But many devs do value the ability to stay in hash-map land all day without needing to think about how to cross an abstraction boundary into the DB, and MongoDB actually has a pretty cool solution there.

pmelendez · 10y ago

SQL is fine, no problem with it unless you are embedding queries in an OO application and you are forced to hack a mapping layer between the two interfaces, which gives you two options: either compromise performance for the sake of ease of development, or vice versa.

2 more replies

anton_gogolev · 10y ago

> Making the developer interface to the database a structured format instead of a textual query language was a clear win.

No, it was not. It is an abomination, just like SharePoint CAML Query or any other "express-your-query-as-an-AST" approach. Sure, it's a paradise for Buzzword-Compliant Fluent Interface Enterprise Query Builder-type libraries.

> Querying SQL data involves constructing strings...

No. Case in point: LINQ. How it is implemented is irrelevant to the statement above.

> ...and counting arguments very carefully

No. Named parameters to the rescue.

dkhenry · 10y ago

LINQ is just an "express your query as an AST" approach. Sure, LINQ presents itself as a DSL on top of that AST (which is exactly what SQL is at the end of the day), but there are libraries for MongoDB that allow you to do the same thing. There is even a LINQ driver for MongoDB[1]. What you are missing is that by having the lowest common denominator be the AST instead of a DSL you have removed a huge burden from both the server and client side of the operation. From that AST you can build any kind of client-side interface without worrying about how to parse or compile the result into a specific textual language; even plain old SQL[2] can be used if you like your server spending its clock cycles parsing text.

1. http://mongodb.github.io/mongo-csharp-driver/2.1/reference/d...

2. https://www.progress.com/connectors/mongodb

1 more reply

pmelendez · 10y ago

> or any other "express-your-query-as-an-AST" approach

Well, SQL is just an AST abstraction, which was made to be human readable. And that's exactly the issue: SQL is excellent for humans to express queries but really inconvenient to interface with other programs unless you express every single query ad hoc (which seems to be the way most people choose, since ORMs tend to perform poorly).

gedrap · 10y ago

If you've been programming and using SQL for a few years, it's not a problem. It might take a bit of time to get the feeling of SQL, but it's ok later. And that's fine. rails-15min-blog things are cool and all, but some things have a learning curve and that's ok.

The vast, vast majority of SQL in the wild is pretty straightforward and sucks only when the schema sucks (can't blame the language for that). Most of the time people hate SQL, they do so because they are working with a crap schema (whether someone else created it or they themselves didn't think it through enough). A sane schema + putting a bit of effort into writing readable SQL can go a long, long way.

snaily · 10y ago

Given how quickly software engineering as a discipline moves, it's not unreasonable that most users at any given point in time (for any given library) are in those first few years.

Optimizing for the "newbie" case is not a failure.

1 more reply

nemothekid · 10y ago

>If you've been programming and using SQL for a few years, it's not a problem.

Years of SQL injection mitigation don't agree with you. Are you sure you don't just prefer it because you are familiar with it?
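
For readers who haven't seen the failure mode in question, a small self-contained illustration using Python's built-in `sqlite3` module. The `users` table and the hostile input are made up for the demo; the point is only the difference between splicing a value into SQL text and binding it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0), ('root', 1)")

user_input = "nobody' OR '1'='1"   # hostile value typed into a form

# Unsafe: the input is spliced into the SQL text and changes its meaning.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the value is bound as a parameter, so it is only ever treated as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), "rows leaked by the concatenated query")
print(len(safe_rows), "rows returned by the parameterized query")
```

A structured query format avoids the first form by construction, while SQL relies on developers consistently choosing the second.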

1 more reply

yoklov · 10y ago

Not a fan of mongo's query format/language, but I have to agree. It's always bugged me that so much work has gone into relational databases, and yet the only way anybody interfaces with them is by using an extremely quirky programming language from the 70's (or whenever).

gedrap · 10y ago

>>> extremely quirky programming language from the 70's (or whenever)

Because it doesn't have to be shiny and all, it just works. There are no widely used alternatives, so we can say that people are happy enough using it. See the situation with JS. CoffeeScript and all, it's quite popular. Is there an analogue (in terms of functionality and popularity) in SQL? Nope. No one bothered to do it well enough, or not enough people found it worth learning.

We don't need to reinvent something every few years just because new JS frameworks come out and others are forgotten every few months.

collyw · 10y ago

Well, you do have MS Access query builder, which I used for a while. Once you got used to it, it was maybe a bit faster than using real SQL. But then it locked you into the query builder way of thinking, and didn't provide very much benefit over textual SQL.

xlm1717 · 10y ago

For this dev, it's a big deal. I love writing a query in MongoDB way more than I love writing a query in MySQL.

Granted, I think a big part of it has to do with joins being the annoying part of SQL...

acjohnson55 · 10y ago

What's annoying about joins? I think practically speaking, whether you're going relational or not, joins are a reality of a data model that supports multiple views and transformations. Assuming that's the case, the only question is whether you're doing joins ad hoc in your application code or you've learned the handful of variations that work natively in your RDBMS.

collyw · 10y ago

That seems to add to the idea that NoSQL fans just don't know how to use SQL properly.

2 more replies

ThePhysicist · 10y ago

I actually built a MongoDB query interface for SQL via Python+SQLAlchemy. It allows you to query your relational database just like you would with MongoDB, and in addition use some things (e.g. Joins in queries or "deep querying") that are not possible using MongoDB.

Here's the repo:

https://github.com/adewes/blitzdb

The implementation is stable but I'm still working on finishing Python3 support and documentation.

bryanlarsen · 10y ago · 10 in thread

"So while MongoDB today may not be a great database, I think there's a good chance that the MongoDB of 5 or 10 years from now truly will be."

Either MongoDB will be, or other databases that have learned the lessons, both good and bad, of MongoDB.

RethinkDB appears to have captured the "MongoDB done right" mindshare, and PostgreSQL has gained JSON and is gaining better replication in order to cover the same niches.

infomofo · 10y ago

I agree. I was a huge fan of MongoDB when it came out because of the unique data structures it enabled easily. However, when it came time to select a new database for my new project, I found that the JSON support that PSQL had added gave me all the flexibility I needed while still in a somewhat relational form. Additionally, it is dead simple to spin up a Postgres RDS instance in AWS, and it's a pain to use Mongo there.

annnnd · 10y ago

> ...instance in AWS, and it's a pain to use Mongo there

Why is that?

Quick google search doesn't hint at problems, but rather at pretty slick marketing pages: (which doesn't mean much, I know) https://aws.amazon.com/blogs/aws/mongodb-on-the-aws-cloud-ne...

3 more replies

threeseed · 10y ago

How can MongoDB be a pain in AWS? It is the easiest database in the world to set up. Download and run ./mongod. I've set up plenty of them in AWS and had zero issues.

And if we are talking about managed databases then it's equally dead simple to spin up a Compose/MongoLab instance in AWS.

1 more reply

tracker1 · 10y ago

Agreed, and now that RethinkDB supports automagic failover, it's pretty much a no brainer... and while I like Mongo's query interface slightly more for most queries, RethinkDB avoids some of the weirdness when you have more interesting queries. And server-side collation/joins is a really nice feature in a document-centric database.

PostgreSQL really needs a MUCH better replication/sharding/failover story... While I would use PostgreSQL in a situation where all I need/want is a single server, where multiple servers are needed for HA/failover, I'd probably just defer to MS-SQL, only because pg is so convoluted in that regard.

As to MySQL/Maria... I haven't touched it in years, and every time I have, some weird behavior drives me nuts. I find it funny that people can love MySQL and bash on JS.

I'd also like to acknowledge ElasticSearch and Cassandra... ES is wonderful to work with for what it does best, search, and C* is a champ when you need a really good distributed table/kv store, though I think that RethinkDB is a better option today, if you don't need more than 10-20 nodes (which is a LOT).

Thaxll · 10y ago

PostgreSQL is nowhere near having clustering / HA / sharding features. AFAIK it's only a master / slave architecture by default.

jpgvm · 10y ago

Sure, but the underlying primitives are better. Specifically, it supports quite flexible binary replication, including the new logical replication features. With this you can build things like Manatee[1], which enable effectively seamless HA. The platform I am currently working on, Flynn[2], uses a variant of this state machine to implement an effectively maintenance-free Postgres cluster.

In future, though, there will be support for bi-directional replication in Postgres, i.e. true multi-master support.

[1] https://github.com/joyent/manatee

[2] https://flynn.io/

threeseed · 10y ago

> RethinkDB appears to have captured the "MongoDB done right" mindshare

Mindshare is irrelevant. MongoDB is killing it in the enterprise right now. They have integration with Oracle, Teradata, Hadoop and countless partnerships with other vendors. You can guarantee MongoDB will still be around in 20 years the way it is positioning itself. Can't say the same about RethinkDB (as great as it is).

> PostgreSQL has gained JSON and is gaining better replication in order to cover the same niches

The PostgreSQL replication story is pretty pathetic given how old/mature it is. And I've seen nothing to suggest that anything is really improving in this area. There are a range of addons, none of which are supported or built in. Basic replication is confusing, the documentation nonexistent in parts, and good luck getting any support.

You compare it to MongoDB (or really any of the newer NoSQL databases) and it's like night and day. It takes minutes to setup a replica set and there is plenty of documentation and official support for any issues.

elithrar · 10y ago

> You compare it to MongoDB (or really any of the newer NoSQL databases) and it's like night and day. It takes minutes to setup a replica set and there is plenty of documentation and official support for any issues.

MongoDB makes the operational side of replication easy, but handwaves a safe, functioning implementation: https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read...

virmundi · 10y ago

It really is killing it in the Enterprise, and I'm trying to do my part to remove it. I'm at a client that wants to use MongoDB. It's on the approved product list. They have little to no experience with it.

Every chance I get, I advocate ArangoDB. It also is Mongo Done Right. You get joins, graphs and a thoughtful future plan from ArangoDB team. To help bridge the gap I've written an ArangoDB Hadoop connector [1]. Unlike the MongoDB one, you can read and write.

I've also added better Clojure support to it: from a driver to a Ragtime migrator.

Sadly as it stands Mongo has a better Ops story than ArangoDB. Until that improves, I don't think that ArangoDB will make it into many Fortune 1000's outside of some small prototype style applications. Maybe micro-services in the enterprise will change this, but I don't think a large insurance company wants to support multiple database standards in general, and definitely not within a family.

1 - https://github.com/deusdat/guacaphant

2 more replies

bryanlarsen · 10y ago

Today's mindshare is tomorrow's market share. It's not assured, but there's a strong correlation. Conversely, lack of mindshare doesn't really hurt sales, but it does hurt growth.

PostgreSQL 9.4, 9.5 and 9.6 all introduce foundational changes to eventually enable replication, but none of it is really exposed to the end user. They are working on it, but they are being very conservative.

2 more replies

krisdol · 10y ago · 6 in thread

I don't understand the recent backlash against NoSQL here.

First off, almost all of the complaints would have been valid years ago. Secondly, there is so much more choice out there today if mongodb wasn't the right answer for your project, and so many NoSQL stores have had time to mature and get polished APIs and docs.

We use various data stores for different purpose across microservices, mostly ES, couchbase, and datomic, and "use the right tool for the job" and "do one thing and do it well" feels like the right approach to take. For most applications, a SQL DB feels like a really big hammer that is put to a lot of things that don't look like nails.

Cshelton · 10y ago

Basically, NoSQL became the big trending thing, everyone was pushing it hard simply because other people were as well. Because of this, many people used a NoSQL database when either a.) They were using the wrong technology for the problem they needed to solve or b.) They have very little/no experience with databases and working with data in general and they really screwed themselves up. Then they took to the forums and went on NoSQL crusades.

Nothing is wrong with NoSQL, used correctly and for the right purpose, it is AMAZING.

jpgvm · 10y ago

"For most applications" is very very misleading.

Most applications believe it or not are business modelling problems, which are overwhelmingly relational. SQL was invented to solve these, so no surprise it is actually the best tool for the job by far.

gedrap · 10y ago

>>> use the right tool for the job" and "do one thing and do it well" feels like the right approach to take

Absolutely. However, database is a sort of an extreme example. A lot in the modern software (especially Web) relies on the database, and often migrating to completely different one (because requirements change and it might not be the right tool anymore) is a huge task. So you want to use something flexible enough.

Also, you want to hire people, people leave the jobs, people change teams, etc. If you use some exotic, less common DB, it adds a lot of overhead. And if you apply the "right tool" to an extreme and have a few completely different DBs flying around, your maintenance cost increases a lot.

See, SQL might not be a perfect, most elegant choice, but most often it is just good enough. A lot of people have used it, a lot of people have scaled it. If you run into an issue, often enough, other people did too and blogged about it, etc. Hiring / getting help will be much easier than with $insertNoSQLDBName.

And, let's be realistic, relatively few companies have hundreds of gigabytes or terabytes of data that typical relational DBs can't handle.

My rule of thumb is that if you're in doubt, use SQL/relational store (I realize that they are different things but often used as synonyms and mean MySQL/PostgreSQL/etc).

yummyfajitas · 10y ago

The main problem MongoDB solves is "I don't want to learn SQL". The backlash is against this use case.

(This article certainly seems to be appealing to this use case, c.f. "counting arguments really carefully".)

bsg75 · 10y ago

Alternatively, "I want to do everything in JavaScript", and not learn any other languages.

A lot of recent "innovation" is mislabeled laziness.

jeffdavis · 10y ago

"Do one thing and do it well" is problematic for things that store a lot of data. Especially for things that are supposed to be an authoritative source.

Getting storage right is very hard. Either it's too low-level, and it's hard for applications to coordinate complex operations without corrupting data; or you end up putting a lot of features in and end up with a SQL dbms; or everything does its own storage and you have a mess.

ngrilly · 10y ago · 5 in thread

I agree that the three areas outlined in the article are things that MongoDB got right: a structured query language (instead of a textual query language), replica sets, and the oplog.

But the lack of transactions over multiple documents (in the same shard at least) and the lack of joins over multiple collections are a big showstopper for the kind of applications I develop.

I note that solutions like YouTube's Vitess provide something similar to MongoDB's replica sets.

I also note that PostgreSQL's logical decoding provides the same functionality as MongoDB's oplog tailing.

s_kilk · 10y ago

> a structured query language (instead of a textual query language)

Oh crowning irony of ironies, SQL literally means "Structured Query Language". :)

ngrilly · 10y ago

LOL. Yes, I realized this after pressing the submit button :-)

progx · 10y ago

I always wonder what kind of simple apps most people must write if they don't need joins.

I would be happy if I got such simple tasks :)

threeseed · 10y ago

How exactly do you think eBay, GMail, Facebook etc. work? They aren't relying on relational database joins.

If you want to write a truly scalable application you structure everything such that you do joins in your application layer.

http://highscalability.com/ebay-architecture

And in the case of MongoDB you avoid joins since it is a document database. You embed data instead.
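
As an illustration of the two options mentioned here, a short Python sketch with made-up data. The lists stand in for the results of two separate, join-free queries (an assumption; no database is involved), and `embedded_doc` shows the document-style alternative where the related data lives inside the parent.

```python
from collections import defaultdict

# Hypothetical results of two separate, join-free queries.
departments = [{"id": 1, "name": "engineering"}, {"id": 2, "name": "sales"}]
employees = [
    {"name": "Ann", "department_id": 1},
    {"name": "Bob", "department_id": 2},
    {"name": "Cy", "department_id": 1},
]

# Application-layer hash join: index one side, then probe with the other.
by_department = defaultdict(list)
for employee in employees:
    by_department[employee["department_id"]].append(employee["name"])

joined = {d["name"]: by_department[d["id"]] for d in departments}

# Embedded alternative: the related data is stored inside the parent document,
# so reading it back needs no join at all.
embedded_doc = {"name": "engineering", "employees": ["Ann", "Cy"]}

print(joined)
```

The trade-off is the one raised elsewhere in the thread: the embedded form reads in one step, but every copy of the embedded data has to be kept consistent by the application.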

1 more reply

maze-le · 10y ago

There is always the possibility of running a map/reduce over multiple datasets. It is not exactly a replacement for a join, but you can cover aggregation over multiple datasets with an output collection. Well... in my opinion this is much more complicated and error-prone than a join operation on a relational database.

yummyfajitas · 10y ago · 3 in thread

Counting arguments very carefully? Nearly every SQL library does this for you.

    cur.execute("INSERT INTO a (b,c) VALUES (%(a)s, %(b)s);",
        { 'a' : a, 'b' : b })

Also, SQL is typed, so even if you did fail to count arguments there is a good chance you'd just detect it the first time you ran it.

The article acts as if treating the DB like native structures is somehow innovative and new - it's not. https://en.wikipedia.org/wiki/Object_database

We mostly abandoned object databases because they sucked. SQL was a huge improvement over them. SQL is a great way to organize and preserve the integrity of a lot of business data.

It's also a fantastic way to avoid repeated trips to the DB:

    SELECT * FROM employees AS e
        WHERE e.department_id = (SELECT id FROM departments WHERE name = 'engineering');

In Mongo, I'm pretty sure you need to first lookup engineering, then lookup the employees in engineering. That could be O(# employees in engineering) queries rather than 1.
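
A runnable version of the comparison, using Python's built-in `sqlite3` module and an invented two-table schema. It shows the single subquery next to a two-step lookup done from application code; it does not model MongoDB itself, only the difference in round trips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT,
                            department_id INTEGER);
    INSERT INTO departments VALUES (1, 'engineering'), (2, 'sales');
    INSERT INTO employees VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', 1);
""")

# One round trip: the subquery is resolved inside the database.
one_trip = conn.execute(
    "SELECT name FROM employees WHERE department_id = "
    "(SELECT id FROM departments WHERE name = ?) ORDER BY id",
    ("engineering",),
).fetchall()

# Two round trips: look up the department first, then its employees.
(dept_id,) = conn.execute(
    "SELECT id FROM departments WHERE name = ?", ("engineering",)
).fetchone()
two_trips = conn.execute(
    "SELECT name FROM employees WHERE department_id = ? ORDER BY id",
    (dept_id,),
).fetchall()

print(one_trip, two_trips)
```

Both paths return the same rows; the difference is how many times the application has to talk to the database to get them.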

acjohnson55 · 10y ago

> In Mongo, I'm pretty sure you need to first lookup engineering, then lookup the employees in engineering. That could be O(# employees in engineering) queries rather than 1.

Or, you could denormalize, and give yourself all sorts of future headaches maintaining data integrity.

lloyd-christmas · 10y ago

> In Mongo, I'm pretty sure you need to first lookup engineering, then lookup the employees in engineering. That could be O(# employees in engineering) queries rather than 1.

The problem with that summary boils down to bad architecture. The point of document storage is storage with purpose; the intent being to make querying EASIER. This could easily be structured to be a single query. You can structure a document countless ways to represent that query, all of them would likely be different based on the purpose of the app.

yummyfajitas · 10y ago

Whereas with SQL there is more or less a single canonical way to do it and it's mostly independent of the app. I.e. the data design is minimally coupled to the specific use cases.

Right now I'm building a data store and I don't know the app(s) that are going to be built on it.

It would be really great if computing could stop forgetting its history. Object databases failed for a reason.

1 more reply

emilburzo · 10y ago · 3 in thread

I have to agree with the author, especially since the points he raises are the ones that helped me greatly on my first "serious" personal project[1].

Coming from postgresql land I would have never thought you can have such great replication with automatic failover. I've had literally 100% uptime for the past year.

And that's on commodity servers (one of them being in a room in my apartment, the other two in a proper datacenter) going through the usual upgrades, downtime, reboots, going from mongo 2 to mongo 3 and such.

Speaking of which, the migration from mongo2 to mongo3 was another pleasant surprise: they've made it backwards compatible. So I could do the upgrade on the servers, one by one, checking everything was ok and after that I could focus on updating the drivers and rewriting the deprecated queries, no need to have everything ready at once.

The accessible oplog was another gem that fit my project really well. Gone was the need to poll the database, I could just "watch" the oplog. That, coupled with long polling on the browser side meant I'd have very little chatter between the db/server/web client when idle. Websockets would have been nice, but adoption wasn't high enough that I'd be comfortable going forward with it.

And all this considering MongoDB was my first NoSQL experience.

I agree it doesn't fit every project, but when it does, it's a really nice experience.

[1] https://graticule.link/

ngrilly · 10y ago

I agree that MongoDB has a great replication story.

But I don't understand that part:

> The accessible oplog was another gem that fit my project really well. Gone was the need to poll the database, I could just "watch" the oplog.

Coming from PostgreSQL, you could do the same using LISTEN/NOTIFY?

emilburzo · 10y ago

> Coming from PostgreSQL, you could do the same using LISTEN/NOTIFY?

I have to admit I was not aware of this feature.

However, from the docs[1]:

> Commonly, the channel name is the same as the name of some table in the database, and the notify event essentially means, "I changed this table, take a look at it to see what's new".

From what I understand, you just know that something has changed, the actual change is not included in the event, so you need at least another query to see what changed.

Did I understand correctly?

In MongoDB you get the operation (insert, update, delete), the document and another few details right in the event.

[1] http://www.postgresql.org/docs/9.4/static/sql-notify.html
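
A rough Python sketch of what consuming such events could look like, run over hand-built sample entries rather than a live replica set. The field names (`op`, `ns`, `o`, `o2`) follow the commonly described oplog entry layout and should be treated as an assumption here; the `app.points` namespace and the documents are invented.

```python
# Simulated oplog-style entries. In a real deployment these would be read
# from the replica set's oplog rather than built by hand (assumption).
entries = [
    {"op": "i", "ns": "app.points", "o": {"_id": 1, "lat": 46.7, "lon": 23.6}},
    {"op": "u", "ns": "app.points", "o2": {"_id": 1}, "o": {"$set": {"lat": 46.8}}},
    {"op": "d", "ns": "app.points", "o": {"_id": 1}},
]

OP_NAMES = {"i": "insert", "u": "update", "d": "delete"}


def describe(entry):
    """Turn one entry into (operation, namespace, document id)."""
    # For updates the target document is identified by "o2", not "o".
    target = entry["o2"] if entry["op"] == "u" else entry["o"]
    return OP_NAMES[entry["op"]], entry["ns"], target["_id"]


events = [describe(e) for e in entries]
print(events)
```

The point of the sketch is that each event carries the operation and the affected document, so no follow-up query is needed to find out what changed.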

2 more replies

count · 10y ago

https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read...

I don't know that its story is so great. EASY, sure, but what good is easily replicating bad data?

1 more reply

rwmj · 10y ago · 2 in thread

Just a note that in PG'OCaml (an OCaml interface to PostgreSQL), you can write:

    "insert into foo (col1,col2,col3) values ($a, $b, $c)"

and it creates the safe prepared statement with ? placeholders. At compile time. Type-checked against the database to make sure your program types match your column types.

http://pgocaml.forge.ocamlcore.org/

annnnd · 10y ago

I would be very careful with such SQL statements. I am guessing it relies on some intrinsic fields' order? That could change anytime. Order of fields shouldn't have any impact on your app, but I think in your case it does.

rwmj · 10y ago

The "..." wasn't literal. I have amended the post to make this clear.

bsg75 · 10y ago · 2 in thread

> You can argue, and I would largely agree, that this is actually part of MongoDB's brilliant marketing strategy, of sacrificing engineering quality in order to get to market faster and build a hype machine, with the idea that the engineering will follow later.

Author nearly lost me here with this logic. Placing Marketing ahead of quality in something that is supposed to store a very valuable asset (data) is near insanity.

I get the mindset of "break fast", "release often", etc. in terms of customer-facing features, but in something that is supposed to be a core part of your foundation, stability is of utmost importance. Otherwise nothing else works - and you lose customers, business, opportunities - because you can't look them up later.

It's not "brilliant marketing", it's just marketing.

smacktoward · 10y ago

This is all true, but the success of MySQL shows pretty clearly that just because something is insane doesn't mean it's not good business.

bsg75 · 10y ago

I think the success of MySQL is due to there being fewer options for a period of time (the "dot.com boom"), and thus it became a popular choice to avoid commercial RDBMS costs.

I'm no MySQL fan when things like PostgreSQL are an option, but it's probably more sane than some other currently popular choices.

_yy · 10y ago · 2 in thread

RethinkDB took all the good parts of MongoDB and added proper engineering.

https://www.rethinkdb.com/

ngrilly · 10y ago

But still no transactions over multiple documents (at least in the same shard)?

jmakeig · 10y ago

ACID transactions in a highly available distributed system are hard and often fail in subtle ways when done wrong at the edges. Any implementation will take years to mature in the lab and in actual production usage. This isn’t a knock on the Rethink guys; their product looks pretty awesome and is moving quickly.

For a solution today, MarkLogic is a transactional distributed document database. Cross-document and cross-partition transactions have been a key tenet of the architecture from the beginning (like, 2002 beginning). Take a look at https://developer.marklogic.com/blog/how-marklogic-supports-... for details.

Full disclosure: I’m a Product Manager at MarkLogic.

angelbob · 10y ago

I love the point about the Oplog.

There are a few equivalents for common SQL DBs (see LinkedIn's Databus for Oracle and MySQL), but in general, getting access to the write log is really hard. Even though it's sitting there!

It would be wonderful if there were some kind of established API or library that would let you parse the MySQL write log without doing hideous, fragile operations that change from version to version. Sure, change the format, but at least version and document it!

sriku · 10y ago

When we chose MongoDB for a project, a dominant criterion was out of the box geo queries. It helped that the storage and query approach had good impedance match with NodeJS. From a query perspective, we wouldn't have benefited much from SQL anyway, since much of the reading is free text or social graph or location based search which we moved to Solr.

franzwong · 10y ago

It has become much simpler to set up replication in PostgreSQL than before.

reference: https://www.digitalocean.com/community/tutorials/how-to-set-...

j / k navigate · click thread line to collapse

110 comments

65 comments · 12 top-level

s_kilk10y ago· 20 in thread

> Let's start with the simplest one. Making the developer interface to the database a structured format instead of a textual query language was a clear win.

dbattaglia10y ago

lloyd-christmas10y ago

meowface10y ago

Same. I wish there was a DSL layer on top.

1 more reply

collyw10y ago

SQL is still one of the most readable languages in my opinion. Its the one language where I find it easier to read queries than write them.

jerf10y ago

2 more replies

s_kilk10y ago

I should clarify, I'm a big fan of SQL and relational DBs, but it's hard to deny that some/many devs find the abstraction boundary a bit weird.

And really, when you think about it, it is a bit weird, (we send sentences of almost-english toward the DB, then get structured data back) and there's nothing wrong with that.

pmelendez10y ago

2 more replies

anton_gogolev10y ago

> Making the developer interface to the database a structured format instead of a textual query language was a clear win.

> Querying SQL data involves constructing strings...

No. Case in point: LINQ. How it is implemented is irrelevant to the statement above.

> ...and counting arguments very carefully

No. Named parameters to the rescue.

dkhenry10y ago

1. http://mongodb.github.io/mongo-csharp-driver/2.1/reference/d... 2. https://www.progress.com/connectors/mongodb

1 more reply

pmelendez10y ago

> or any other "express-your-query-as-an-AST" approach

gedrap10y ago

snaily10y ago

Given how quickly software engineering as a discipline moves, it's not unreasonable that most users at any given point in time (for any given library) are in those first few years.

Optimizing for the "newbie" case is not a failure.

1 more reply

nemothekid10y ago

>If you've been programming and using SQL for a few years, it's not a problem.

Years of SQL injection mitigation don't agree with you. Are you sure you don't just prefer it because you are familiar with it?

1 more reply

yoklov10y ago

gedrap10y ago

>>> extremely quirky programming language from the 70's (or whenever)

We don't need to reinvent something every few years just because new JS frameworks come out and others are forgotten every few months.

collyw10y ago

xlm171710y ago

For this dev, it's a big deal. I love writing a query in MongoDB way more than I love writing a query in MySQL.

Granted, I think a big part of it has to do with joins being the annoying part of SQL...

acjohnson5510y ago

collyw10y ago

That seems to add to the idea that NoSQL fans just don't know how to use SQL properly.

2 more replies

ThePhysicist10y ago

Here's the repo:

https://github.com/adewes/blitzdb

The implementation is stable but I'm still working on finishing Python3 support and documentation.

bryanlarsen10y ago· 10 in thread

"So while MongoDB today may not be a great database, I think there's a good chance that the MongoDB of 5 or 10 years from now truly will be."

Either MongoDB will be, or other databases that have learned the lessons, both good and bad, of MongoDB.

RethinkDB appears to have captured the "MongoDB done right" mindshare, and PostgreSQL has gained JSON and is gaining better replication in order to cover the same niches.

infomofo10y ago

annnnd10y ago

> ...instance in AWS, and it's a pain to use Mongo there

Why is that?

A quick Google search doesn't hint at problems, only at pretty slick marketing pages (which doesn't mean much, I know): https://aws.amazon.com/blogs/aws/mongodb-on-the-aws-cloud-ne...

3 more replies

threeseed10y ago

How can MongoDB be a pain in AWS? It is the easiest database in the world to set up. Download and run ./mongod. I've set up plenty of them in AWS and had zero issues.

And if we are talking about managed databases then it's equally dead simple to spin up a Compose/MongoLab instance in AWS.

1 more reply

tracker110y ago

As for MySQL/Maria... I haven't touched it in years, and every time I have, some weird behavior has driven me nuts. I find it funny that people can love MySQL and bash JS.

Thaxll10y ago

PostgreSQL is nowhere near MongoDB on clustering / HA / sharding features. AFAIK it's only a master/slave architecture by default.

jpgvm10y ago

In the future, though, there will be support for bi-directional replication in Postgres, i.e. true multi-master support.

[1] https://github.com/joyent/manatee [2] https://flynn.io/

threeseed10y ago

> RethinkDB appears to have captured the "MongoDB done right" mindshare

> PostgreSQL has gained JSON and is gaining better replication in order to cover the same niches

elithrar10y ago

MongoDB makes the operational side of replication easy, but handwaves a safe, functioning implementation: https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read...

virmundi10y ago

I've also added better Clojure support to it: from a driver to a Ragtime migrator.

1 - https://github.com/deusdat/guacaphant

2 more replies

bryanlarsen10y ago

Today's mindshare is tomorrow's market share. It's not assured, but there's a strong correlation. Conversely, lack of mindshare doesn't really hurt sales, but it does hurt growth.

2 more replies

krisdol10y ago· 6 in thread

I don't understand the recent backlash against NoSQL here.

Cshelton10y ago

Nothing is wrong with NoSQL; used correctly and for the right purpose, it is AMAZING.

jpgvm10y ago

"For most applications" is very very misleading.

gedrap10y ago

>>> use the right tool for the job" and "do one thing and do it well" feels like the right approach to take

And, let's be realistic, relatively few companies have hundreds of gigabytes or terabytes of data that typical relational DBs can't handle.

My rule of thumb is that if you're in doubt, use SQL/relational store (I realize that they are different things but often used as synonyms and mean MySQL/PostgreSQL/etc).

yummyfajitas10y ago

The main problem MongoDB solves is "I don't want to learn SQL". The backlash is against this use case.

(This article certainly seems to be appealing to this use case, cf. "counting arguments really carefully".)

bsg7510y ago

Alternatively, "I want to do everything in JavaScript", and not learn any other languages.

A lot of recent "innovation" is mislabeled laziness.

jeffdavis10y ago

"Do one thing and do it well" is problematic for things that store a lot of data. Especially for things that are supposed to be an authoritative source.

ngrilly10y ago· 5 in thread

I agree that the three areas outlined in the article are things that MongoDB got right: a structured query language (instead of a textual query language), replica sets, and the oplog.

But the lack of transactions over multiple documents (in the same shard at least) and the lack of joins over multiple collections are a big showstopper for the kind of applications I develop.

I note that solutions like YouTube's Vitess provide something similar to MongoDB's replica sets.

I also note that PostgreSQL's logical decoding provides the same functionality as MongoDB's oplog tailing.

s_kilk10y ago

> a structured query language (instead of a textual query language)

Oh crowning irony of ironies, SQL literally means "Structured Query Language". :)

ngrilly10y ago

LOL. Yes, I realized this after pressing the submit button :-)

progx10y ago

I always wonder what kind of simple apps most people must be writing if they don't need joins.

I would be happy to get such simple tasks :)

threeseed10y ago

How exactly do you think eBay, GMail, Facebook etc. work? They aren't relying on relational database joins.

If you want to write a truly scalable application you structure everything such that you do joins in your application layer.

http://highscalability.com/ebay-architecture

And in the case of MongoDB you avoid joins since it is a document database. You embed data instead.
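To make "join in the application layer" and "embed data instead" concrete, here is a rough sketch in plain Python. The records and the `embed_employees` helper are made up for illustration; no database or driver is involved, and the two lists just stand in for the results of two separate fetches.

```python
from collections import defaultdict

# Hypothetical records, as two separate fetches might return them.
departments = [{"id": 1, "name": "engineering"}, {"id": 2, "name": "sales"}]
employees = [
    {"name": "ada", "department_id": 1},
    {"name": "bob", "department_id": 2},
    {"name": "cy", "department_id": 1},
]

def embed_employees(departments, employees):
    """Hash join in application code: group employees by department id,
    then attach each group to its department as an embedded list."""
    by_department = defaultdict(list)
    for employee in employees:
        by_department[employee["department_id"]].append({"name": employee["name"]})
    return [
        {**department, "employees": by_department[department["id"]]}
        for department in departments
    ]

documents = embed_employees(departments, employees)
```

Each entry in `documents` now has the shape of an embedded document: the department plus its employees in one structure, so a later read needs no join. The cost is that the employee data lives inside the department record and has to be kept in sync there.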

1 more reply

maze-le10y ago

yummyfajitas10y ago· 3 in thread

Counting arguments very carefully? Nearly every SQL library does this for you.

    cur.execute("INSERT INTO a (b,c) VALUES (%(a)s, %(b)s);",
        { 'a' : a, 'b' : b })

Also, SQL is typed, so even if you did fail to count arguments there is a good chance you'd just detect it the first time you ran it.
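A small sketch of the same idea that runs without a server, using Python's standard `sqlite3` module instead of the driver in the snippet above. The table `a` and columns `b`, `c` come from that snippet; the column types, the values, and the `insert_with_wrong_arity` helper are assumptions. Note that the miscount here is caught by the driver's binding check on first execution, which is not the same thing as SQL's typing catching it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (b INTEGER, c TEXT)")

# Named placeholders: no argument counting, values are bound by key.
conn.execute("INSERT INTO a (b, c) VALUES (:b, :c)", {"b": 1, "c": "x"})

def insert_with_wrong_arity(connection):
    """Try a positional insert with one value missing; return the error."""
    try:
        connection.execute("INSERT INTO a (b, c) VALUES (?, ?)", (2,))
    except sqlite3.ProgrammingError as exc:
        return exc
    return None

error = insert_with_wrong_arity(conn)
rows = conn.execute("SELECT b, c FROM a").fetchall()
```

The failed statement inserts nothing, so `rows` holds only the row written with named parameters.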

The article acts as if treating the DB like native structures is somehow innovative and new - it's not. https://en.wikipedia.org/wiki/Object_database

We mostly abandoned object databases because they sucked. SQL was a huge improvement over them. SQL is a great way to organize and preserve the integrity of a lot of business data.

It's also a fantastic way to avoid repeated trips to the DB:

    SELECT * FROM employees AS e
        WHERE e.department_id = (SELECT id FROM departments WHERE name = 'engineering');

In Mongo, I'm pretty sure you need to first lookup engineering, then lookup the employees in engineering. That could be O(# employees in engineering) queries rather than 1.
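For comparison, a sketch of both shapes against an in-memory SQLite database (Python's standard `sqlite3`; the rows are invented and the table layout is assumed from the query above). It does not touch MongoDB: the two-step variant only stands in for "look up the department first, then the employees", and in this form it issues two statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (1, 'engineering'), (2, 'sales');
    INSERT INTO employees VALUES (1, 'ada', 1), (2, 'bob', 2), (3, 'cy', 1);
""")

# One statement: the subquery resolves the department id inside the database.
one_trip = conn.execute(
    "SELECT name FROM employees WHERE department_id = "
    "(SELECT id FROM departments WHERE name = ?) ORDER BY id",
    ("engineering",),
).fetchall()

# Two statements: look up the id first, then fetch the employees with it.
(dept_id,) = conn.execute(
    "SELECT id FROM departments WHERE name = ?", ("engineering",)
).fetchone()
two_trips = conn.execute(
    "SELECT name FROM employees WHERE department_id = ? ORDER BY id", (dept_id,)
).fetchall()
```

Both variants return the same rows; the difference is only in how many times the application talks to the database.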

acjohnson5510y ago

> In Mongo, I'm pretty sure you need to first lookup engineering, then lookup the employees in engineering. That could be O(# employees in engineering) queries rather than 1.

Or, you could denormalize, and give yourself all sorts of future headaches maintaining data integrity.

lloyd-christmas10y ago

> In Mongo, I'm pretty sure you need to first lookup engineering, then lookup the employees in engineering. That could be O(# employees in engineering) queries rather than 1.

yummyfajitas10y ago

Whereas with SQL there is more or less a single canonical way to do it and it's mostly independent of the app. I.e. the data design is minimally coupled to the specific use cases.

Right now I'm building a data store and I don't know the app(s) that are going to be built on it.

It would be really great if computing could stop forgetting its history. Object databases failed for a reason.

1 more reply

emilburzo10y ago· 3 in thread

I have to agree with the author, especially since the points he raises are the ones that helped me greatly on my first "serious" personal project[1].

Coming from postgresql land I would have never thought you can have such great replication with automatic failover. I've had literally 100% uptime for the past year.

And all this considering MongoDB was my first NoSQL experience.

I agree it doesn't fit every project, but when it does, it's a really nice experience.

[1] https://graticule.link/

ngrilly10y ago

I agree that MongoDB has a great replication story.

But I don't understand that part:

> The accessible oplog was another gem that fit my project really well. Gone was the need to poll the database, I could just "watch" the oplog.

Coming from PostgreSQL, you could do the same using LISTEN/NOTIFY?

emilburzo10y ago

> Coming from PostgreSQL, you could do the same using LISTEN/NOTIFY?

I have to admit I was not aware of this feature.

However, from the docs[1]:

> Commonly, the channel name is the same as the name of some table in the database, and the notify event essentially means, "I changed this table, take a look at it to see what's new".

From what I understand, you just know that something has changed, the actual change is not included in the event, so you need at least another query to see what changed.

Did I understand correctly?

In MongoDB you get the operation (insert, update, delete), the document and another few details right in the event.

[1] http://www.postgresql.org/docs/9.4/static/sql-notify.html
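To illustrate what "the operation and the document right in the event" buys you, here is a rough Python sketch that replays oplog-style entries into an in-memory cache. The entries are simplified assumptions (plain dicts with `op` set to `i`, `u` or `d`, a namespace `ns`, the payload `o`, and `o2` identifying the target of an update); real entries carry more fields, and `apply_entry` is a hypothetical helper, not a driver API.

```python
def apply_entry(cache, entry):
    """Apply one simplified oplog-style entry to an in-memory cache keyed by _id."""
    op, doc = entry["op"], entry["o"]
    if op == "i":                      # insert: 'o' is the full document
        cache[doc["_id"]] = dict(doc)
    elif op == "u":                    # update: 'o2' identifies the document
        key = entry["o2"]["_id"]
        target = cache.setdefault(key, {"_id": key})
        target.update(doc.get("$set", {}))
    elif op == "d":                    # delete: 'o' carries the _id
        cache.pop(doc["_id"], None)
    return cache

# Made-up entries for illustration.
entries = [
    {"op": "i", "ns": "app.users", "o": {"_id": 1, "name": "ada"}},
    {"op": "u", "ns": "app.users", "o": {"$set": {"name": "Ada"}}, "o2": {"_id": 1}},
    {"op": "i", "ns": "app.users", "o": {"_id": 2, "name": "bob"}},
    {"op": "d", "ns": "app.users", "o": {"_id": 2}},
]

cache = {}
for entry in entries:
    apply_entry(cache, entry)
```

No follow-up query is needed to find out what changed: each entry is enough on its own to keep the cache current.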

2 more replies

count10y ago

https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read...

I don't know that its story is so great. EASY, sure, but what good is easily replicating bad data?

1 more reply

rwmj10y ago· 2 in thread

Just a note that in PG'OCaml (an OCaml interface to PostgreSQL), you can write:

    "insert into foo (col1,col2,col3) values ($a, $b, $c)"

and it creates the safe prepared statement with ? placeholders. At compile time. Type-checked against the database to make sure your program types match your column types.

http://pgocaml.forge.ocamlcore.org/
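A toy sketch of the placeholder rewriting, in Python rather than OCaml. This is not how PG'OCaml is implemented: it runs at runtime, does no type checking against the database, ignores string literals, and emits PostgreSQL-style numbered placeholders (`$1`, `$2`, ...) rather than `?`. The `to_positional` name is made up.

```python
import re

def to_positional(sql):
    """Rewrite $name placeholders as $1, $2, ... and return the ordered names."""
    names = []

    def replace(match):
        name = match.group(1)
        if name not in names:
            names.append(name)
        return f"${names.index(name) + 1}"

    return re.sub(r"\$([A-Za-z_]\w*)", replace, sql), names

statement, names = to_positional(
    "insert into foo (col1,col2,col3) values ($a, $b, $c)"
)
```

The returned `names` list gives the order in which the program's variables would be bound, so the values never get spliced into the SQL text itself.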

annnnd10y ago

rwmj10y ago

The "..." wasn't literal. I have amended the post to make this clear.

bsg7510y ago· 2 in thread

Author nearly lost me here with this logic. Placing Marketing ahead of quality in something that is supposed to store a very valuable asset (data) is near insanity.

It's not "brilliant marketing", it's just marketing.

smacktoward10y ago

This is all true, but the success of MySQL shows pretty clearly that just because something is insane doesn't mean it's not good business.

bsg7510y ago

I think the success of MySQL is due to there being fewer options for a period of time (the "dot.com boom"), and thus it became a popular choice to avoid commercial RDBMS costs.

I'm no MySQL fan when things like PostgreSQL are an option, but it's probably more sane than some other currently popular choices.

_yy10y ago· 2 in thread

RethinkDB took all the good parts of MongoDB and added proper engineering.

https://www.rethinkdb.com/

ngrilly10y ago

But still no transactions over multiple documents (at least in the same shard)?

jmakeig10y ago

Full disclosure: I’m a Product Manager at MarkLogic.

angelbob10y ago

I love the point about the Oplog.

There are a few equivalents for common SQL DBs (see LinkedIn's Databus for Oracle and MySQL), but in general, getting access to the write log is really hard. Even though it's sitting there!
