This is bad advice. It's best to understand your problem domain and use the tools that are most appropriate. You see a lot of two types of posts on HN:
* "I picked a NoSQL database for a problem domain with a better relational fit." Those posts look like this one.
* "I picked an RDBMS for a problem domain with a better NoSQL fit." Those posts are usually titled something like "How I scaled Postgres to XYZ qps" and describe an insane amount of re-engineering and operational hell. Oddly, these posts are usually proud of the accomplishment rather than embarrassed that they picked the wrong tools in the first place.
There are upsides and downsides to RDBMSes. From my experience, you should be leaning towards NoSQL systems (of which there are many, each suited to different use cases) when you have very large scaling needs (in terms of dataset and qps), heavily polymorphic data, or data that has ambiguous structure.
It's been stated many times already: Use the right tool for the right job.
The problem with that argument is that things change, and it's not always obvious what problems you will need to solve when you start a project.
RDBMSs are backed by extensive theory as well as decades of product hardening, development of expectations and best practices, and battle-tested in production environments. They are explicitly designed as general-purpose systems. That means that when your needs change and you need to adapt, an RDBMS likely provides features that make that transition work smoothly and avoid breaking applications.
At this time, those things are not true of MongoDB, so it is somewhat more specialized. That's OK, but it means that you need to be more careful about using it than, say, postgres. Trying to put MongoDB on equal footing with postgres and say "choose the right tool" is an oversimplification.
I think it's perfectly reasonable advice to start with SQL unless you have specific reasons that you think something else will be a better fit.
It is a lot easier to start strict and organized and back down to less strictness with looser rules where you need it, than to do the opposite.
Sure. The big problem though is that the problem domain of RDBMS's is ad hoc reporting which is pretty universal.
rom my experience, you should be leaning towards NoSQL systems (of which there are many, each suited to different use cases) when you have very large scaling needs (in terms of dataset and qps), heavily polymorphic data, or data that has ambiguous structure.
On the other hand, you should choose an RDBMS if you are pretty sure reporting needs will change or you will need to accommodate new metrics on a frequent basis.
The problem is there is a huge overlap between those two. NoSQL means telling your customers "no, we can't do that new report you'd like right now." That's a huge tradeoff for any customer-facing app.
After all the basic tradeoff is between flexible input and flexible output.
There have been more than a couple times when I was ready to blame the relational model, but further investigation revealed that the real root of the problem was that the existing schema or query used an approach that was optimized for one DBMS but performed terribly on the one we were actually using.
Oracle is in a world of its own in a lot of ways, but if you are moving to or from Oracle you should have the required resources and expertise in your budget.
If you are referring to the well-known cost of COUNT(*) in PostgreSQL (or MySQL with InnoDB, for that matter), or different ways to handle full-text searching, I agree that there are differences you have to deal with, but they are not usually a big deal.
Contrast the fairly easy and routine process of migrating between MySQL, PostgreSQL, or SQL Server with the huge amount of work involved changing your application code from MongoDB to anything else.
Disagree to the books recommended. SQL is only a query language, not the database itself. It definitely should be part of the consideration. Understanding how database engines works under the hood is more important in terms of performance in high-concurrency, high load scenarios.
What this means is that optimizations appropriate for InnoDB (index everything you search on) are very poor for PostgreSQL and vice versa. In PostgreSQL you index only a few columns, typically join conditions and a few select search columns, and the combination of the operating system and database will do the rest. In short, InnoDB only does index scans (and the table is a special index). This makes some things perform better (primary key lookups), but many other things perform worse. Moreover optimizing for one does not mean optimizing for the other....
The same problem exists with OLAP databases.
People trying to do MOLAP without building ROLAP. They are thinking that MS SASS, ESSBASE, Cognos or Qlikview will do some "magic" that eliminate the need to carefully think about data
Look at the bright side: programmers who are writing NoSQL-backed apps are creating the fossil fuel that will keep programmers who know RDBMs working into our retirement years. I already have more work than I can do fixing web apps that were built around crap data management tools that failed to scale beyond a few thousand users. Your Postgres expertise will still be a money-making skill long after MongoDB is forgotten.
Basically, every evangelism point, you have to double-check and cross-reference, because as you say, the NoSQL guys are encountering issues the RDBMS community addressed years ago (7 in my example, but the sharding stuff, 20+ years) - except they think they are discovering it for the first time!
There's also the tendency to favor new shiny things and reject old crufty (but proven) things, to want to be part of what seems like the leading edge, to be that guy in the cube farm who is playing with the cool new stuff.
I have been programming longer than RDBMSs have been available, so I know from experience what it's like to manage large databases in application code, and how hard it can be to maintain consistency or do accurate queries and aggregation with half-baked tools. It's frustrating to see a new generation of programmers go through this, but it's human nature to ignore the past.
My fourteen year old son wears his pants pulled down below his waist, Vans shoes, hoodies, lots of hair. He looks pretty much like I did when I was fourteen back in the 1970s. The underlying technologies are the same: pants, shirt, shoes, sweater, hair. The only differences are superficial. To him that style is edgy and contemporary and something his parents don't get. NoSQL is the gangster fashion of programming right now.
I'm using it on one of my projects, and as I said, there are things to like about it, but graph traversals remain problematic. Overall, I'd say I'm pretty unhappy with MongoDB, and I wish I hadn't chosen it; it doesn't work well for my project. A lot of things I'm doing boil down to performing some kind of graph traversal, which is painful to do at runtime in Mongo (the potential solution space is too large to be precomputed). From what I've seen, MongoDB hasn't been working for me very well with a highly-connected data set.
Some of the biggest companies e.g. Twitter, Foursquare, Google, Amazon all rely on NoSQL.
The real issue I see is that by dismissing NoSQL as only for fools RDBMS developers are failing to see why they are popular to begin with. Take PostgreSQL for example and how difficult it is to shard/replicate compared to CouchDB or MongoDB. This is an area PostgreSQL should see as an opportunity for improvement.
But dismissing huge groups of people as uneducated just makes you seem uneducated.
I said this below, but I'll say it again. Data is malleable, and writing apps to fit around any datastore seems wrong. I write applications to fulfill their use case. When the needs of the application change, so could it's database (or other dependencies).
I've worked on big applications with big databases where DB operations and aggregations started to get too slow but so far I or someone smarter on the team has always been able to figure out a solution without throwing the RDBMS out. One big advantage of working with RDBMSs is the roads are paved and well-mapped. By contrast you still need a machete and good luck in the NoSQL world, and even then you can get sick or lost in the jungle.
I haven't personally worked on a web app with the kinds of database scaling issues I've dealt with in logistics applications involving tens of millions of movements every day.
RDBMSs were invented to allow data malleability. Go back to the old textbooks and look at what preceded the relational model. Remember that data is more valuable and permanent than application code. When you have multiple applications accessing the same database, each with their own implementation of consistency rules and their own internal idea of the data schema, come back and explain how that is more fun than using an RDBMS and being able to sleep nights.
Another area is for high-write throw-away data. If you are facebook, do you really care if every message gets through or every feed element gets propagated?
Also perhaps content management (which is a write seldom, read often field) might be good, particularly where data can be reconstructed from other sources. Maybe doing the writes first in an RDBMS and pushing out to MongoDB might be a good thing there? You could then do things like referrer tracking and log analysis back in your RDBMS.
Although hybrid databases are way cooler. One idea we are throwing around regarding LedgerSMB development right now is using JSON types in PostgreSQL 9.2 to store extended attributes, so you could have a customer account, and decide you want to store some extra data about this that we don't support right now, and have it stored in JSON.
But but but.... NoSQL is cool ;)
I've used it on small projects, and have enjoyed it. Perhaps my data has just been simple/loosely-coupled enough to never run into these problems?
I read a lot of posts like this on HN before every trying Mongo, so I've at least been convinced to always implement schema at the application layer. Others seem to keep learning that lesson in harder ways.
It's not as bad as it's made out to be. It's only if you really are looking to scale out, you should probably be better of picking something else.
The majority of the NoSQL databases are based on Amazon's Dynamo: loosely coupled replication. MongoDB is one of the few (next to Hbase and a few others) that adopts Google BigTable's architecture: data is divided in Ranges, and each mongod node serves multiple Ranges.
This means MongoDB is able to provide atomicity where it's harder with other SQL databases. In particular, we need to be able to do some sort of "compare and swap" operation that is guaranteed to be atomic/consistent, while still being able to have our mongod nodes distributed over multiple datacenters.
In Dynamo-based architectures, in order to provide the same amount of atomicity, you always end up writing to at least half + 1 the amount of replicated nodes you have available in your cluster. This is more awkward, and reduces the flexibility of the whole (the atomicity guarantee Mongo provides also works for stored javascript procedures, for example).
Having said that, we're using MongoDB about 3 years in production at this point, but we're far from happy about the availability it provides (issues like MongoDB not detecting that a node has gone down, failing to fail over, etc). We run a HA service, and to date all of our failures in uptime have been either the fault of our hosting provider or mongodb not failing over when it should. As such, we're always looking for a better alternative to move to, but at the moment MongoDB is about as good as it gets.
Just. Wow.
Get your head out of the sand mate. MongoDB is nowhere NEAR as good as it gets.
Workloads where writes are rare or you have a single writer separate from your readers, most reads are simple fetch-by-ID, and more complex queries are unpredictable and/or suitable for overnight batch runs.
Workloads where performance doesn't matter but you want schemaless for convenience.
I think you are not alone in learning this lesson with this particular technology. Fortunately it's one I learned by proxy from working adjacent to a team that decided to introduce Mongo into their stack...but I still wake up and hear the sceams at night of "You have to put the whole dataset on RAM?"...you weren't there, man...we lost a lot of good guys...
You have to draw a clean line between "stuff it is really fun and enlightening to play with" and "stuff you introduce into your stack".
I'm not suggesting everyone should be sticking with 1966 COBOL - times change, new tech comes up which makes sense to adopt. But not jumping on the shiny new tech bandwagon can have social consequences you need to be aware of.
I'm pretty new to the whole database thing, but how is MongoDB different from Postgres or Mysql in this respect? In a traditional database, the data is pulled directly from the hard drive. Why does Mongo suffer a performance hit and Mysql doesn't?
MySQL's InnoDB table engine, on the other hand, uses direct I/O (in the recommended scenario) and manages the buffer pool independently of the kernel. Its buffer pool manager is specifically designed for the typical workloads MySQL is used for (http://dev.mysql.com/doc/refman/5.5/en/innodb-buffer-pool.ht...) - as opposed to the naive LRU that most OSes employ for their filessytem buffers.
Even if you have a clean line, when do you promote something across that line? It is easy for little bits of weirdness to escape detection(Mostly thinking about Cassandra write failures here.) in a fun to play with environment especially if they only come out when you are running a cluster.
This is hardly a hidden feature...
I've been having a similar problem with an SQLite data store, only the other way around. Strings were getting converted to numbers, with leading zeros that were significant and needed to be maintained being lost along the way.
It sucked all the fun out of dynamic typing for me. At least in combination with automatic type conversions. Having to think about type and when to make transitions across type boundaries when you need to is just a little light busywork. Having to worry about type and transitions across type boundaries being made contrary to your intentions is a downright PITA and, it turns out, a serious quality control issue.
I use Mongo daily on a Go project, and I actually think it's pretty annoying; I'm not trying to be a Mongo apologist, but ... this type conversion argument doesn't seem to be very fair to Mongo.
I'm a little disappointed how any post about moving from X to Y (especially if X is Mongo) makes the top of the front page on HN. This is not really a very good or insightful post. It's one persons experience and anecdotes of the pain points of learning a new technology. Mildly interesting, but not really expository in any way.
SQLite, for better or worse, is designed to do what you had an issue with. Pick a different DB if you need strict data types. Check out section 2.0 Type Affinity.
It's the trade-off of being able to scale reads and writes horizontally. And unless you need it, an RDBMS makes sense given the flexibility.
Maybe, instead of looking at NoSQL as a full-on replacement for RDBMS, we can look at it as a better solution to sharding.
I think no-schema fits agile quite well. For rapid prototyping, I prefer Mongo to even sqlite.
In general you need to know what your doing under the hood and how either solution effects your problem domain. Where I work we need to aggregate billions of data points on demand. This can't be done in real time without pre-aggregating the results ( and even then it takes tons of I/O just to process the aggregated data set )
If you have millions of rows, you are probably better off with something like MongoDB, if you need to search that, you should probably use something like Sphinx or Lucene anyway. But if you know that you won't have too much data for the forceable future, you should use relational databases. OR you could simply use both.
OpenStreetMap has over a billion nodes stored in a PostgreSQL database.
http://www.openstreetmap.org/stats/data_stats.html
My point is that you can get very far with a classic relational database before you have to scale vertically.
The problem comes with high concurrency, in particular very high write concurrency, or with very complex queries which require a lot of RAM to do properly. But that's where you need a solid db, good hardware, and a solid DBA.
However, what he always fails to mention is that some of the most important OLTP that happens is in enterprise ERP systems, and these involve every kind of workload of all of these. Yes, you could probably get some additional performance gains by splitting up your ERP into VoltDB, Vertica, and CouchDB with connectors between them but why?
And relational data doesn't even start to get moderately big until you are in the TB range. This is true even of ERP apps.
Millions of rows are a problem? What are you using? Microsoft Access?
No one can do that. That is your problem, not MongoDB.
As far as aggregation, use the new Aggregation Framework http://docs.mongodb.org/manual/tutorial/aggregation-examples...:
db.zipcodes.aggregate
$group:
_id: "$state"
totalPop:
$sum: "$pop"
,
$match:
totalPop:
$gte: 10 * 1000 * 1000
As far as "losing the independence of your data access paths", no you don't. You are free to use linking instead of embedding wherever you want. http://www.mongodb.org/display/DOCS/Schema+Design#SchemaDesi...MongoDB doesn't have a built-in full text search? So what. Most systems with large amounts of text to search do not rely on the text search capabilities built into relational databases anyway. People use actual full-text search engines like Lucene/Solr, Sphinx, reds, etc. Having said that, if you just wanted to support lowercase keyword queries with MongoDB, would it really be so hard to extract and store lowercase keywords from your text, as suggested here? http://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mong...
If you are trying to add four 1s and get '1111' instead of 4, that is an error in your application code which has nothing to do with MongoDB. Very common problem with JavaScript. If it is JavaScript, try finding the code where you are attempting numeric addition and change it so that instead of saying for example 'total += newNumber' it says 'total += (newNumber * 1)' .
Jeez, lay off the confrontational tone. He doesn't say anything about OCR. Maybe he's using humans to do data entry? In any event, it's completely irrelevant to the topic of databases.
OK, maybe he is using humans to do data entry. The home page to me implies that the process is automatic, but I guess it doesn't rule out the possibility of humans doing data entry when he says 'tag and categorize'. But if he is using humans to do data entry instead of some automatic OCR, that is still his main business problem, rather than MongoDB. The application is relevant to the database discussion, and Hacker News is about all aspects of startups.
Had it been any one the issues I've pointed out, I would be quite happy to work around them. I do that routinely with other pieces of technology that do 95% of what I want. However, the hits kept coming. At the same time, I was using Postgres on another project and the question was begging to be asked - if I am going to run extra services like Lucene and do extra work to achieve full text search, what am I getting in return? And I am sorry to say - case insensitive search is quite a basic feature.
As for the 1111 issue - I admit it can be fixed. But the fact still remains - there's no way I would face this problem with Postgres. So why not just switch?
Postgres is a really nicely thought out and well executed database. And don't get me wrong - I am not building a system designed to operate at stratospheric scale. That would be premature in my opinion.
Thanks for taking the trouble to comment.
In general, I love the ideas behind NoSQL. I can still feel the excitement when reading the BigTable and MapReduce papers. HBase, Hadoop, Radis, etc. are awesome products. I use some of them in my work. But some other NoSQL products? Being engineers, we must understand the implementation and be full aware of its limitations, instead of believing their marketing materials. Well, if all you want is to test a toy product, to build a prototype, or your product is of low concurrency and low data size and you have no concern on operation, it certainly looks that they make your development easier. But in these scenarios, any good relational databases won't add significant burden either.
And as engineers we must understand that most other engineers do take their role seriously and evaluate products on their merits.
Implying that they are falling for "marketing" just because you don't agree with their choice and then lecturing them for their choice doesn't make you come across well.
- You can make case insensitive searches on the DB using regexes (http://www.mongodb.org/display/DOCS/Advanced+Queries#Advance...). A simple case-insensitive regex is not very bad performance-wise, but in general, case-insensitive searches should be avoided for search purposes (you can normalize to set everything to lower case or other equivalent trick)
- The proper way of doing an audit (and search later) is to make an independent collection with a reference to the other(s) document in a different collection. Then you can index by user, date, or any other field and leave the main collection alone. The described embedded access collection doesn't look very scalable.
- Making map-reduce queries is tricky (at least for me). I think the guys on 10gen realizes that and the new aggregation framework is a way of dealing with this. Anyway, the main advantage of SQL is this kind of things, the rich query capabilities. Even if MongoDB allows some compared with other NoSQL DBs, if there is a lot of work in defining new queries, probably a SQL DB is the best fit, as that is where SQL excel.
I don't truly believe in this "you should research everything before starting" (I mean, I believe in research, but too many times the "you should do your homework" argument is overused. Sometimes you make a decision based in some data that changes later, or is incomplete), as there are a lot of situations where you find problems as you go, not in the first steps. But, according to the description, looks like PostgreSQL is a better match and the transition hasn't been too painful, so we can classify this into "bump in the road"/"success history". Also, probably right now the DB schema is way more known that in the beginning, which simplifies the use of a relational DB.
I think this point is very important even in the RDBMS side. There are cases, even with relational datastores that would preform better if the dataset was built to the query.
The difficulty comes into play when you are trying to keep the denormalized data up to date based on changes within the base dataset.
This is true. After all you can't extract an audit trail from a deleted record. Simply in terms of information management it often makes sense to represent this as separate info.
> "There are cases, even with relational datastores that would preform better if the dataset was built to the query."
The big problem with doing it that way is that you are screwed as soon as requirements change. I recently blogged about not-1NF designs in PostgreSQL (nested data structures for subset constraints), and the lesson I took away was that you really don't want to have your select queries hitting the same tables you maintain for inserts. You want your queries hitting a normalized data structure even if the data comes in as something different.
gregjor, gaius, and zemo
In this case it appears to mostly be those arguing for Postgre, but I wouldn't care if you were arguing for sunshine and unicorns there is a way to behave civilly and your not doing it.I have never been upvoted so much on HN. I admit to strong opinions and light sarcasm but you'll have to show me where I've been uncivil, belittling or insulting, except perhaps in response to people who insulted me.
Curious. I have never found a database named "Postgre" to be used by anybody. Perhaps you can direct me to the download site.
I think the bigger issue is that there isn't a lot of discussion from the NoSQL crowd about what you give up when you go to a NoSQL solution. I think that sort of disclosure would help people weigh the options a lot better.
From the comments of some people here you'd almost think they would build an ERP app in Mongo....
The schema-less database approach also seems attractive at first but updating your data whenever your "app schema" changes starts to become a pain real quick.
Now I can't really live w/o having a schema first, it actually saves you a lot more time in the long run (even short run), being schema-less means you can't really do anything too fancy w/ your data (generate reports, advanced search, etc...)
And you can do anything to your data without a schema, you just need to build your app as a service that provides access to it.
IME SQL schemas do more harm than good; usually you end up with a schema that's subtly weaker than what's actually valid for your application, and the difference between the two models will trip you up at the worst possible time. Have a small, distinct set of classes that you store, enforce that you store only those (and don't access the storage layer any other way), enforce that they remain backwards compatible, and enforce that you can't create invalid instances. But application code is the best place to do all these things.
Completely disagree here. The basic tradeoff is between flexible input and flexible output. Without a rigid schema, ad hoc reporting is impossible because you don't have an ability to articulate reporting criteria. I.e. no declarative schema means no declarative reporting query.
I suppose that's ok as long as you never need to report on anything..... Might work....
Naive question from somebody who has done a little reading on and dabbling with key/document-with-MapReduce style datastores, but who hasn't tackled a real production problem: I thought running queries over the entire dataset was one of the assumptions of horizontally scalable document stores? In terms of avoiding computation, you can only limit queries by document key, which even if you're clever/lucky doesn't always encode the parameters you're querying on, or doesn't encode them in the right order, so you should be prepared to run queries over your entire dataset. Hopefully the queries you run often are optimized (e.g., using indexes or clever use of key ranges), but in the general case, you have to be prepared to scan the whole shebang, and that's supposed to be okay because of horizontal scalability, right?
It's a tradeoff between the benefits of the document store vs the loss of relational data. The blog author here clearly didn't understand the trade-offs he was making.
As always with these discussions: it's important to use the right tool for the job. I'm a big fan of what Mongo is doing. I've used it two higher scale projects and have no complaints. Of course, I'm using it in the context for which Mongo excels.
It looks like this single initial assumption is where things started going wrong: conflating the pieces of paper that happen to be called "documents" in the real world with the concept of a "document" in the context of a system like MongoDB.
It seems like another instance where all the good names were taken. http://jcooney.net/post/2012/04/03/Transaction-Argument-Clas...
I think the recent popularity of MongoDB bashing is maybe a testament to MongoDBs popularity. I'd guess that because MongoDB is probably the closest NoSQL database to a RDBMS with its ad hoc queries, that it is attracting many newcomers.
I spent many years hammering away with RDBMS and by and large it was great until it wasn't. I try to look at data storage more holistically now based on best guess of the problem. I have tried to convert an application from Postgresql to MongoDB and it failed but that wasn't MongoDB's falt it was because I didn't change the data model to fit a document storage system. I have also tried to use PostgreSQL for a realtime reporting system and failed horrifically and that was not Postgesql fault it was mine. Amazing what happens when you stop pushing a chain and start pull it!
This is true. Managing complexity is always an important task. I am not sure that NoSQL solves this however. Also the best way to break things up is to loosely couple things, and this requires to some extent that you have ACID compliance. A good RDBMS, like PostgreSQL or Oracle, will provide tools for managing that loose coupling.
>"2. If the software is popular enough everyone is running to use NoSQL (cache is NoSQL)."
Like proverbial lemmings over a cliff....
>"3. Clearly it is not a good storage solution either because for example in an address book nested list greatly simplifies everything. (right tool for the job)"
Funny, I thought nesting was what WITH RECURSIVE was for....
I am not saying there aren't use cases for MongoDB or reasons to switch some applications. For example I can think of a few really cool apps, like maybe a network back-plane for a huge LDAP directory. Also content management might be a good fit. But despite your years of experience, it doesn't sound like you have really looked at how to solve these with good RDBMS's.....
Please knock this crap off.
All that said, our developers loved the ease of simple retrieval and insertion, and of course the scalability. So I guess you ultimately need to base your decisions on your priorities.
I don't fault the OP though, since it's hard to know just how limiting NoSQL will be until you try to do all the things you used to assume were database tablestakes (no pun intended).
I choose MongoDB for my last side project and while it was awesome working schema-less and developing the client facing part of the project was certainly quicker to deliver, i feel pretty lost on the analytics/BI side of it and couldn't say it better than him: "Not having JOINs makes your data an intractable lump of mud"
So coming from a relational/SQL background I found MongoDb awesome upfront, but frustrating later on... and yes I'm off to learn http://docs.mongodb.org/manual/applications/aggregation/
Just reading this blog - it's clear that MongoDB was not a good fit for him, if he had bothered to do some research, he would have found this out on day one.
Thats the real lesson he should be taking away from this and blogging about yet somehow MongoDB are trolls and it's all their fault because of a lack of features and they have bypassed 40 years of computer science and blah blah blah blah, excuses, excuses, excuses.
edit: removed a few pointless sentences :-)
It's possible the reasoning he used to use mongodb is the same as the one he used to abandon it.
There are some approaches to solve some of the author's problems that end up making the Mongo system look and feel a lot more like a SQL system because sometimes data is actually related.
The author could have also taken a different approach to his data schema that would have fit more of a non-relational worldview.
Software development and architecture is about making choices and working with and around the limitations of your tools. It doesn't matter if PostgreSQL or MongoDB are "better". It's about solving a problem using a set of tools you are comfortable with.
I have a background in data warehousing in both Oracle and SQL Server, and was part of the decision to use a polyglot persistence model. I've got at least a decade's worth of experience in the DW world, and more as a general developer before that, so I like to think I've got a relatively credible background in a variety of data stores.
I haven't looked at Mongo much - it's durability concerns and the write lock stuff pushed me away from it early on (I don't mean to disparage it, but that was where it was at when I evaluated it), but Cassandra's configurable consistency levels and operational story at a cluster level are what sold us for our time-series data (that, and the ability to construct a sparse timeline and multiplex reads/writes). For anything we need flexible querying with, we push it into specialized Postgres dbs.
The level of willful ignorance and vitrol in this thread is kind of amazing. Most of the really experienced DW guys I know are all looking at HBase, Cassandra and others because they fit a niche that we've all been looking for in certain data sets at really large scale. It doesn't mean we're ditching our relational data stores, it just means we're augmenting them with other tools because they fit the job at hand. To suggest that one tool is absolutely perfect for every scenarios seems a little short-sighted to me, possibly driven out of inexperience. I don't mean that as an insult - I know a lot of guys who've been working on the same data sets for 30 years who really do just need the one tool - however, you've got to realize there are other data sets and problems for which your hammer just won't fit.
I doubt that, unless it's extremely simple, any set of requirements are an exact match to only one of these technologies... mix and match is the future :P
My primary objective is that my application fulfills it's use case. Data is malleable, so you should use the right tools for your needs.
That being said, it sounds like the OP was trying to use a chisel as a replacement for a toolbox. Basically fighting the software (mongodb) to fit his requirements, instead of using additional tools.
Tools are enablers and supposed to make ordinary people rock star. If it takes a rock star to use a tool, the tool fails.
Also, a thorough understanding of MongoDB indexing, advanced queries and schema design would have squashed all of these issues. Has anybody had a more pleasant experience with a MongoDB ODM?
For accounting type problems use a relational database. For document driven items - e.g. a resume database - nosql works great. For a hybrid pick your battles... or use both.