Ask HN: Why are there so many NoSQL databases?

20 points · kez · 16y ago · 15 comments

Back in the early 2000s it seemed like your choices for database-driven web sites were MySQL, PGSQL, BerkeleyDB (for the Perlites) and maybe SQLite. Enterprises had their Oracle and SQL Server.

Now, when trying to get stuck into a bit of NoSQL/schema-free/document store databases for the web, I am overwhelmed by the number of options, and am struggling to understand the best one for the job.

Do people genuinely believe that the world needs this many NoSQL systems, or are we just in the infancy/resurgence of schema-free, and things are yet to settle down?

15 comments · 12 top-level

simonw · 16y ago · 1 in thread

It's a Cambrian explosion. A lot of the concepts involved (eventual consistency, CAP theorem, MapReduce) are relatively recent innovations, so there's plenty of scope for exploring them with software. I imagine things will settle down eventually.

z8000 · 16y ago

You are consistently available for insights.

silentbicycle · 16y ago · 1 in thread

The new non-relational databases have fairly different designs. For example, if your data set would fit entirely in memory (on one or a few servers), Redis would probably be a great choice. Their different strengths come out of the design choices that set them apart.

A while ago, there were several different database query languages for relational databases, too. In the interest of having a standard, they compromised on SQL. There are lots of version control systems, parsing frameworks, programming languages, etc., too. This isn't really unique to databases; they just get talked about more since there's so much buzz about hot new web development stuff.

silentbicycle · 16y ago

Huh, apparently my saying so is enough to cite this as fact in kez's blog.

One good source about relational databases (including their history) is _An Introduction to Database Systems_ by C.J. Date. The author has an axe to grind, but he's thorough, and there are plenty of other references cited should you want to dig deeper.

majke · 16y ago · 1 in thread

> Do people genuinely believe that the world needs this many NoSQL systems, or are we just in the infancy/resurgence of schema-free, ...

The non-SQL world is still pretty young. Well, the ideas themselves are old, but recent implementations try to solve unique problem sets.

> ... and things are yet to settle down?

Yes. IMO there will be 5-7 major projects supported by larger communities. Each of these projects will solve a particular problem.

So, instead of having 2-3 general SQL providers, we can expect many solutions for very specific problems. The issue right now is that we don't really know what these problems are. Current NoSQL implementations are probing the market, testing whether their specific features are useful to a broader audience.

I think we can guess some of these 5-7 major specializations, for example:

- Memcachedb: Distributed K-V optimized for speed - no replication

- Distributed K-V optimized for reliability

- Distributed K-V optimized for size - like Dynamo.

- neo4j: Graph database

- redis: K-V with rich features, but limited to data size that fits in memory

- K-V framework created to allow Map-Reduce jobs - including scheduler, debugger and so on.

z8000 · 16y ago

FYI if you are into the bleeding edge, redis has a virtual memory implementation as of about 12 hours ago.

pierrefar · 16y ago

Because they're all different. We have key-value stores (Berkeley DB and MemcacheDB), column stores (Cassandra), document stores (CouchDB, MongoDB), and even new data structures (Redis).

They all solve different types of problems (e.g. document stores vs key-value stores). Even similar databases solve the same problems differently (e.g. sharding). They have different performance profiles and bottlenecks. They give you different ways to model your data and query it. Some are persistent, some are not, and some are lazy persistent.
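To make the "different ways to model your data and query it" point concrete, here is a minimal sketch in Python. The `KeyValueStore` and `DocumentStore` classes are hypothetical in-memory toys invented for illustration, not the API of any of the databases named above. The only thing they are meant to show is that a key-value store treats values as opaque, while a document store can look inside them.

```python
import json


class KeyValueStore:
    """Toy key-value store: values are opaque, lookup is by key only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentStore:
    """Toy document store: values are dicts, so fields can be queried."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        self._docs[doc_id] = doc

    def find(self, **criteria):
        # Return every document whose fields match all given criteria.
        return [
            doc
            for doc in self._docs.values()
            if all(doc.get(field) == value for field, value in criteria.items())
        ]


users = [
    {"name": "ann", "city": "Oslo"},
    {"name": "bob", "city": "Lima"},
]

# Key-value: the store sees only a string, so the application must know the key.
kv = KeyValueStore()
for i, user in enumerate(users, start=1):
    kv.put(f"user:{i}", json.dumps(user))
first_user = json.loads(kv.get("user:1"))

# Document: the store understands the structure and can filter on a field.
docs = DocumentStore()
for i, user in enumerate(users, start=1):
    docs.put(f"user:{i}", user)
oslo_users = docs.find(city="Oslo")
```

With the key-value toy, finding everyone in Oslo means fetching and decoding every value in application code; with the document toy, that filter is part of the store's interface. Real systems differ in far more than this (persistence, replication, sharding), so treat it as an illustration of the modeling difference only.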

Big picture though: this is the first time your average startup/small team/individual hacker has needed a very scalable database solution because of websites. A website has the ability to get you a ton of users very quickly even if you are just one man hacking on a personal pet peeve (I went through this).

This kind of experimentation is awesome and it allows us to figure out what really works in what situations and is a sign of a very healthy community. I love being part of it.

nawroth · 16y ago

To answer your question I think things will settle down -- in the long run it's too hard for developers to deal with all the alternatives and their differences. To get a high-level overview of the NOSQL space you could read this blog entry: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to... As Ben Scofield puts it (cited in that post): "NoSQL DBs often provide better substrates for modeling business domains". I think this aspect is often forgotten in the debate. So I'd say: start from your business domain, what are the characteristics of it? Then look for a DBMS that is a good fit. And to get down to the details of some of the NOSQL systems, here's a walk through: http://www.vineetgupta.com/2010/01/nosql-databases-part-1-la...

bradfordw · 16y ago

Because monoculture is bad and the more options we have, the more they all learn from one another (which drums up competition). In the end, like the other fellows on here have stated, you'll have a few emerge as the "standards" based on the type of problem you are trying to solve.

bitdiddle · 16y ago

I think people have been doing these things for years; in the past they were just more apt to be embedded in desktop applications and so forth. Issues with scaling for the web have changed the dynamics, so the new non-relational approaches are quite distinct from the earlier ones such as Statice, ObjectStore, and Ontos.

CouchDB is well worth a hard look mainly because it takes advantage of several new ideas all in a very simple stack.

In a year or two I predict two or three will emerge as clear choices for a few distinct scenarios.

jokull · 16y ago

Because if you look closely - they're all different. There's a great deal of feature overlap however. It's like the community collectively is throwing things at a wall and seeing what sticks. It's the healthiest way to eventually get the best. My bets are on redis and perhaps MongoDB.

jdp · 16y ago

NOSQL is definitely not a new idea, but the current favorite mode of access and interaction (REST) is. I think the explosion in popularity for the creators is due to a lot of things, including: the ability to start an open source project in a new environment, giving it a real chance for wide adoption; the need to fill a niche, since there are many different types of NOSQL stores; and to a lesser extent the perceived simplicity of such a project. For people using NOSQL stores in their projects, the attraction comes from the mix of shiny new technology and performance benefits, both real and perceived. It also helps that there are many different types, each addressing a different requirement.

keefe · 16y ago

I'm a big fan of document databases. It was one of those things where I was working on my app, thinking to myself... self, aha! I need documents or arbitrary kvp storage... then thinking, yeah somebody else must have done that already and there I am on couchdb or whatever. Having to have schemas is just an unnecessary pain in the ass imho (sometimes necessary blah)

kez (OP) · 16y ago

Thanks for all the comments; I have put together a brief summary: http://www.justkez.com/why-are-there-so-many-nosql-options/

It's been very interesting reading the responses.

koenbok · 16y ago

It's a hip thing to work on, an interesting problem to solve, has some nice ideas around it, has great potential to get lots of users and there are no widely accepted solutions yet.
