Robert Haas, a Postgres committer , occasionally blogs about comparisons : http://rhaas.blogspot.com/search/label/mysql
Theres also http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL
Historically , MySQL has been more widely available on low end web hosting plans, so its what a lot of people first use when they start using databases, and a lot of web apps, such as Wordpress, support it exclusively.
Until a year or two ago, only MySQL had built in (if occasionally fragile) replication which made it popular for that reason alone. Postgres now has robust replication, with new features coming down the pipeline soon : http://www.depesz.com/2011/07/26/waiting-for-9-2-cascading-s....
I prefer Postgres, but oddly enough under Oracle theres been some interesting features added to MySQL, which is good for both.
Also, MySQL got commercial entity behind it in its early days which promoted it a lot. In addition, it worked an all platforms, including windows, while Postgres was there just in last couple of years.
I cringe every time I read that. MySQL replication is many things, but it is not reliable (as anyone who has used it at scale will confirm).
I think the only reason this myth prevails is because hardly anyone ever actually verifies if their master/slave are in sync. A table checksum can be a real eye-opener here, especially on a deployment that's been running for while and undergone schema changes, restarts, network splits, etc.
PostgreSQL's built-in replication is pretty easy to set up[1] and provides a writable master, and a cascade of slaves. Slaves can be synchronous or asynchronous, and the synchronicity can be turned off per transaction.
[1] http://www.depesz.com/2011/01/24/waiting-for-9-1-pg_baseback...
MySQL is good enough so there is no need to migrate applications for a few additional features.
In the Windows world the same is true of SQL Server -- the setup, connectivity, and basic usage is so incredibly easy that it made it the first choice of many teams.
This seems incredible -- that products are chosen on such an irrelevant-in-the-long-term basis -- however it has proven true across almost all of the computing market, even targeting highly skilled developers. PHP has few competitive merits, yet it was the default option for many because it was so easy to make something basic in.
There's a lesson there in that.
I don't think there is anything too incredible in that. If you want to throw together an idea quickly, get it out there and test response then use whatever technology gets the job done quickest. You can always change later.
Why waste huge amounts of time setting up a technically perfect database for a product it turns out no-one wants?
"But we’re relatively satisfied with sharded MySQL storage for Evernote user account metadata, even though that’s not going to win any style points from the cool kids."
Indeed, hipsters beware!
In this model, it's the application's job to anticipate and work around these inconsistencies. Assertions that check for data integrity, if you have them, have to be modified to work around this. For a lot of the web applications we talk about these days, who cares - it's people's lists of friends and TODO notes. It's simple data and some dangling records aren't going to hurt anyone.
In the SQL world, we instead write migration scripts that migrate the structure and data all at once. This is a little more work up front, but as long as you stuck to a mostly normalized form and use great tools (which we now have) this is not a big deal ("MONTHS" to migrate across slaves sounds like they had some very wrong decisions made earlier on). The application can move straight to supporting the new structure and entirely forget that the old one existed. In this world, we can also have really complex relationships between fields, like when we're storing accounting or medical data linked to temporal records of changes. The application can consume this structure without worrying about consistency.
I think the problem is that MySQL isn't really a standard SQL-world db. It has some of the advantages of one, but not all of them and some very annoying gotchas.
Not always. Changing the schema can break an application, in particular when the database supports multiple applications.
> It's simple data and some dangling records aren't going to hurt anyone.
It depends on the application. In healthcare people literally die due to some dangling inconsistent records.
The problem in anticipating and working around these inconsistencies is that the workaround is added ad-hoc in code, rather than through a model defining the data change. You need a model http://chronicdb.com/preparing_schema_changes
Migrations are programs, but your existing live application is also a program. Changing the schema breaks your live application to prepare for the new version of the application.
http://chronicdb.com/blogs/nosql_is_technologically_inferior....
Flexibility of schema definition and flexibility of schema change are two different things. Defining schemas only involves data. But changing schemas involves not just data, but code too.
That isn't a reason to choose NoSQL. That's a reason to-
-doubt MySQL -- many of the purported downsides of SQL solutions (for instance the ridiculous "avoid joins" meme that has zero bearing on any good database product) are actually MySQL problems. Or rather, they were -- the product has made some pretty incredible strides.
-understand and embrace normalization. Most of the "we keep trying to change our schema and SQL is just so restrictive" stories could often be described as "the problems with denormalization".
Does it mean they have a database per user? That can't be right is it?
This is really useful for things like sharding, where you can split a database table onto more than one machine, because there will be few queries that will stall fetching data from one machine to another.
See, they haven't run into problems with their setup, as per, MySql 'just works' for them.
What would be interesting and educational (for me anyway) would be a situation where folks that ran into serious problems with their SQL setup despite doing the 'right things' persevered where conventional knowledge would have them switch to a NoSql solution.
tldr; Dog bites man article, would love to hear from someone that actually struggled with a SQL solution and soldiered on.
Storing an entire notebook in a single document would be the most obvious. I use postgres all the time and sql is great, but poo-pooing nosql because it wouldn't work with your relational structure is not the best idea. Also - I have found a hybrid between nosql (mongodb) and sql (postgres) is ideal - who says you need to use a single database?
Traverse all notebook documents and look at each notes date? Good luck with that.
The cost of course being that a change to any note in a notebook yields a save of the entire document. Not a problem in simple cases, but that sort of mass-write-amplification can kill you (talk to Digg about that).
Also - I have found a hybrid between nosql (mongodb) and sql (postgres) is ideal - who says you need to use a single database?
Simplicity. Coherency. Maintainability. And on. Sure, it might make sense, but if you already have you toes in the "SQL" world, it is usually worthwhile to dunk your whole foot in. Many SQL products also offer the document functionality of MongoDB, for instance. SQL Server, as an example, lets you store XML documents to your hearts content, which you can index and intelligently query upon, etc. Your scheme is boundless, and on and on.
Maybe it's the 30 year old in me showing, but I'm sticking with the 'it just works' crowd. Until some other approach provides a staggeringly overwhelming reason to switch. I find scaling up with MySQL to be ridiculously easy, allowing me to focus my time elsewhere. Ram, bandwidth, and fast storage have gotten substantially cheaper in the last few years, making it that much easier and cost effective to throw hardware at scaling up. For 99.9% of the Web, those hardware resources are expanding in value much faster than traffic is increasing.
(It's understood other developers find it just as easy to take a different approach)
* The occasional join.
* Complex queries.
* Reporting.
* Schema changes.
* Transactions, with options for trading off strictness and performance.
* Performance problems that are easily addressed using stored procedures.
* Performance tuning with minimal code changes (e.g. adding an index to change the execution plan of a complex query).
* Enforcement of invariants regardless of application code.
Why would you give these up for scalability problems you won't have, with 99.999999% probability?
a) not all code written is for web apps
b) that said recent technology also only fits a niche for modern web apps?
I think you view on what constitutes modern web apps or software in general is biased in favor of some examples who benefit by using mongo/couch/redis whatever. (And are very vocal about it)
[EDIT: spelling, and to add that I have tremendous joy in playing with couch, redis, node lately. But at the same time I try to stay critical: what are the real trade offs when I employ these tools?]
Hmm, let's see. What does a "modern web app" do? Well there is a screen with fields for users to enter something, and there is another screen where things that (other) users have previously entered formatted nicely for display. Behind the scenes, this data is exchanged between machines that do some other processing on it.
They were doing this in the '60s.
Sometimes you just pick the simplest thing that works and that can be a key value store, data structures in memory, or a lucrene interface. It's a better solution and less fragile than trying to force everything into a relational model.
So something I did for fun turned out to be the simplest thing which worked. Only I didn't know that when I did it. I think developers who don't occasionally pick something new for fun on a side project are doomed not to know when that new thing could actually be applicable on a project that matters.
Also, the example itself is very weak as bitdiffusion below pointed out.
People sometimes conflate the DB access approach (document vs relational) with the storage approach (transactional vs warehouse). This may be because the NoSQL poster child, MongoDB, at one point defaulted to a non ACID mode of operation. But you can have a relational DB that's not ACID (MySQL 3) and an object DB that is (Couch).
I was surprised so much of the original article focused on ACID as though it were the biggest selling point for an RDBMS. It seems like the biggest win (right now) is the sheer number of things a typical RDMBS does for you -- not just ACID but also data integrity (foreign key constraints), automatic index creation (mostly), and automated schema changes across many records (ALTER TABLE). The cost, of course, is the up-front effort of fitting your data and app to the relational model.