"Crucially, because open-source software tends to be written by people who care deeply about its quality (often because they have a direct personal stake in ensuring that the software works as well as possible), it is often of the very highest standard (PostgreSQL, Linux, MySQL, XBMC, Hadoop, Android, VLC, Neo4JS, Redis, 7Zip, FreeBSD, golang, PHP, Python, R, Nginx, Apache, node.js, Chrome, Firefox...). On the other hand, commercial software is often designed by committee, written in cube farms and developed without proper guidance or inspiration (Microsoft BOB, RealPlayer, Internet Explorer 6, iOS Maps, Lotus Notes, Windows ME, Windows Vista, QuickTime, SharePoint..."
I frequently rank Microsoft's product line as follows:
1 - Natural keyboard - the one there is no reasonable replacement for (the Kinesis keyboards are 5 times as expensive). If there is a reason why none of my machines is 100% Microsoft-free, this is it.
2 - their mice - simple, precise, comfortable, inexpensive
3 - SQL Server - it almost makes having a Windows server around worth the pain.
Having said that, I have enormous respect for PostgreSQL. It's a very solid RDBMS and it's my database of choice most of the time. It is more comfortable to use from a command line than SQL Server will ever be (which is OK, because a CLI is not a high priority over there).
The administration GUI included with SQL Server (called SQL Server Management Studio/SSMS) is very robust, and whenever I look I see people asking around for something as good as SSMS for other RDBMSs. If you use Unix you might prefer your command-line tools, but a lot of folks would obviously rather use a GUI.
For developers, I haven't found anything better than SQL Server Data Tools (SSDT) which is a set of extensions for Visual Studio that let you design and develop a SQL Server database. It creates version migration scripts, diffs database schemas and data, integrates with git and other source control providers and generates DML scripts among other things.
I stopped reading after the very first bullet, and just skimmed through most of the titles.
First, you can import/export CSVs. Hell, all you need to do to export is right-click the view, hit Export, then select .csv.
Second, you don't want such a simple installation; MSSQL has several options/packages/configurations. If you don't need a component you turn it off, otherwise you will be charged a fee.
That being said, if you have Chocolatey installed, it's also a one-line PowerShell command, and you can configure your own custom installation command:
choco install MsSqlServer2012Express
Third, I don't feel like correcting the rest of your mistakes.
I'm not shitting on Postgres, it's pretty awesome, but out of the box MSSQL is better. Especially when you have hundreds of servers running it.
I don't know about most of the other stuff, honestly. I've never felt the need to write scripting code into database queries or write my own aggregate functions, so I have no idea what would be better at that. I'll take his word that Postgres is better, and grab it if I ever find myself having to do that.
Are you saying there is no programmatic interface for exporting CSV? You must rely on the administration tool to generate the CSV for you???
For example:
bcp "SELECT blah... FROM blah" queryout C:\..\blah.csv -c -t, -T -S<servername>
When exporting data to CSV where a text field contains the delimiter, PostgreSQL will do the correct thing every time: wrap that field in quotes. MS SQL...will not.
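To illustrate the quoting rule Postgres follows (this is just the common RFC 4180 convention, sketched here with Python's csv module rather than with either database):

```python
import csv
import io

# A field containing the delimiter must be wrapped in quotes,
# and any embedded quotes must be doubled.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow([57, "green", 'He said "hi", twice'])
print(buf.getvalue().strip())
# -> 57,green,"He said ""hi"", twice"
```

Any conformant CSV reader can round-trip that line back to the original three fields.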
If you're interested in databases but aren't familiar with Postgres, it's a good read if you skip the criticisms of MSSQL. Some of those criticisms are spot-on, some are iffy, and some are rah-rah-yay-yay-open-source exhortations that seem to be lifted from Slashdot threads fifteen years ago.
(For whatever it's worth, I love MSSQL. Been using it for well over a decade. That said, I certainly don't mind reading criticisms of it.)
But yeah, an awesome resource for people dipping their toes in the Postgres pool.
There is the appearance of facts, but mostly it is bias by omission. Take for example the CSV complaints..."MS SQL Server can neither import nor export CSV. Most people don't believe me when I tell them this". This is just false, patently and obviously false. BCP and SSIS, both part of SQL Server, have excellent support for importing and exporting flat files. I did keyword searches through the whole post: no instances of BCP, SSIS, or Integration Services (the long name for SSIS).
I am sure others will break this thing apart section by section, but I wish people would be fair and open about stuff like this - I mean, seriously, if you are going to put the time and effort into a long document like this, don't be so blatantly biased.
57,4.3209,"green","He calls himself""Waldo""",98,"Y"
This is a perfectly valid CSV file, but SSIS will choke on it. I ended up writing a custom script component to load these files anyway.
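For what it's worth, a generic CSV parser handles that line fine; here's Python's csv module (just as an illustration, not what SSIS does) decoding the doubled quotes:

```python
import csv

# The doubled "" inside a quoted field decodes to a single literal quote.
line = '57,4.3209,"green","He calls himself""Waldo""",98,"Y"'
row = next(csv.reader([line]))
print(row)
# -> ['57', '4.3209', 'green', 'He calls himself"Waldo"', '98', 'Y']
```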
That said, I absolutely love SQL Server, and I think this guy is a little over the top in his religious hatred of it. In an Enterprise setting, it works really, really well.
Excel in particular drives me up the wall. Plus the automatic inferences in Excel break all kinds of things. Have a six-digit number? Randomly a date. Have a UPC? Going to corrupt it in numerous ways (leading-zero stripping, converting to scientific notation, etc).
I can definitely sympathize with anyone who criticises Microsoft's CSV support. They really just want you to use an Excel format (either the binary format or the zip-XML one). However, Excel formats are overly complicated for the kind of data-migration work CSV is often used for. XML is too large (it adds up) and requires custom code to import/export, even if that is just XSL from CSV; JSON might be better but isn't "yet" popular.
I wish that line-item JSON were more popular: it's more expressive than CSV, the programmatic tooling seems to be a bit better, and it's much less verbose than equivalent XML, which I always had a problem with, as mapping XML to/from an object model is always more painful by comparison.
Line-item UTF-8 JSON + gzip works really well for importing/exporting data...
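A minimal sketch of that export/import round trip (the file name and records here are made up):

```python
import gzip
import json
import os
import tempfile

records = [{"id": 1, "color": "green"},
           {"id": 2, "color": 'He calls himself "Waldo"'}]
path = os.path.join(tempfile.gettempdir(), "export.ndjson.gz")

# Export: one JSON document per line, UTF-8, gzip-compressed.
with gzip.open(path, "wt", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")

# Import: stream it back line by line -- no quoting/escaping ambiguity.
with gzip.open(path, "rt", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]

assert loaded == records
```

Each line is an independent document, so embedded quotes, commas, and newlines-in-strings never confuse the parser the way they can in ad-hoc CSV.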
There's no such thing. CSV is a convention that people mostly agree on but there's no written CSV standard and no CSV standards body.
It's also totally counterproductive. If you tell me to use something and you start telling me things that aren't true I'm going to be deeply skeptical of your possibly-valid points if I know you're either ignorant or dishonest in some cases.
a) Security is critical. It routinely trumps common sense and evidence. And telling people "not to fall for it" is advice that will get you nowhere.
b) Vendor support is critical. And no your local mom+pop consulting shop doesn't count. PostgreSQL could really do with solid, global companies like Microsoft, Datastax, Mongo etc who offer training and proper SLAs.
I'm honestly curious. I just don't know how SLAs work, and whom can you blame if you have problems with MS SQL.
SQL Server itself doesn't ship with an SLA.
You'd get that from your service provider or systems integrator, because the customer chooses her own service levels (determined by business continuity objectives). Her SP or SI (or in-house IT department) would then deploy SQL Server in a way intended to meet the required service level. Specifically, the customer might opt for five nines (99.999%) availability. Such a deployment doesn't look anything like one that must only be available from 8am to 5pm on weekdays - even though both SLAs are met using the same SQL Server code base from Microsoft (but on vastly different hardware and network configurations).
The availability associated with an SLA usually goes hand-in-hand with disaster recovery (recovery time objective, and recovery point objective[1]), but can also apply to support turn-around (a support request is triaged, and based on severity is resolved within the amount of time specified for that severity by the SLA). As mentioned, if I'm not mistaken Microsoft has fixed service levels for that. IIRC an MSDN subscription gets you a small number of free requests. An Enterprise Agreement gets you a whole bunch more.
There are many other quality objectives you can specify with an SLA, including efficiency (capacity), integrity (security), and robustness (stability) [2].
So unless you're hosted by Azure or have an Enterprise Agreement, Microsoft is rather unlikely to provide you with an SLA. And even then you'd be the one telling them what your service level requirement is. If Microsoft can't deliver on your service level requirements, you'd do it yourself, or get a systems integrator to do it for you.
[1] http://www.druva.com/blog/understanding-rpo-and-rto/
[2] https://www.wittenburg.co.uk/Entry.aspx?id=d8c54975-bd0a-410...
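To put numbers on those availability levels, here's the back-of-the-envelope arithmetic (365-day year, ignoring leap days):

```python
# Allowed downtime per year for a given availability target.
def downtime_minutes_per_year(availability: float) -> float:
    return 365 * 24 * 60 * (1 - availability)

for target in (0.99, 0.999, 0.9999, 0.99999):
    print(f"{target:.3%}: {downtime_minutes_per_year(target):,.1f} min/year")
# Five nines works out to roughly 5.3 minutes of downtime per year;
# two nines allows more than three and a half days.
```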
Bringing up the "no warranty" clause is a total straw man argument. Almost every software license (whether proprietary or open source) has a no warranty clause.
As a programmer, I would love to use PostgreSQL everywhere over all the options you mention... Mongo has a pretty decent replication/failover setup, better than MS-SQL imho, with MS-SQL a close second... administering PostgreSQL for high availability seems like an exercise in frustration, with lots of afterthought, bolted-on solutions.
I think by the time we see either 9.6 or so, or 10.0 (depending on how versioning goes), with PL/v8 and sane replication/failover solution in the box, it'll become my db of choice.
Most of what I use MongoDB for today can be done almost as nicely in PostgreSQL, and the latter can do a lot the former would trip on.
1) SQL Server Data Tools for complex schema management.
2) Graphical execution plans.
3) It's the database best supported by Entity Framework. If you know what IEnumerable is, and you know what IQueryable is, and you know what a leaky abstraction is, you can use EF to quickly start getting data in and out of SQL Server with very reasonable performance.
http://www.postgresql.org/docs/9.3/interactive/auto-explain....
External tools aren't really in the scope of this discussion, but while SSMS does a pretty good job of visualizing execution plans, SQL Sentry Plan Explorer does a better job (a more compact view, with a color gradient for the most expensive operators) and makes my job even easier.
http://www.sqlsentry.com/products/plan-explorer/sql-server-q...
> "but MS SQL Server does have a bizarre failure mode which I have witnessed more than once: its transaction logs become enormous and prevent the database from working. In theory the logs can be truncated or deleted but the documentation is full of dire warnings against such action."
This betrays what appears to be a fundamental ignorance of SQL Server; the behavior is not at all bizarre. If a database has been deployed in a non-simple recovery model, then the transaction log is truncated as part of a routine log-backup procedure. If that wasn't happening, you should have hired a (better) DBA.
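For the record, the non-scary fix looks something like this (database name, log file name, and backup path are made up; this is a sketch, not a full backup strategy):

```sql
-- Routine log backup; on completion the inactive portion of the log is truncated.
BACKUP LOG MyDb TO DISK = N'D:\Backups\MyDb.trn';

-- Blunt alternative if point-in-time recovery isn't needed:
ALTER DATABASE MyDb SET RECOVERY SIMPLE;
DBCC SHRINKFILE (MyDb_log, 128);  -- shrink the log file back to ~128 MB
```

Either way, nothing here requires deleting log files by hand, which is what the dire warnings in the documentation are actually about.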
The emphasised part is incorrect: Postgres's text storage has a 1GB limit (which holds for CHAR, VARCHAR and TEXT, since they all use the same underlying mechanism): http://www.postgresql.org/docs/9.4/static/datatype-character...
> In any case, the longest possible character string that can be stored is about 1 GB.
Although note that while this is 1GB, you have to take into account both multibyte characters and compression (which is enabled by default on text values larger than about 2kB, using an LZ-family algorithm).
You can use the filesystem but then you have to manage data migration/sync by "hand" which is kind of a bitch for distributed systems.
Honestly, storing large blobs alongside metadata describing them is extremely common. I've yet to hear how you're meant to do it correctly.
If you are using gigabyte-long text strings in a relational database, you are doing it very wrong.
I would have agreed if he hadn't explicitly mentioned the 2GB limit of MSSQL's TEXT and NTEXT in the very next paragraph.
A) Indexes disappeared without rhyme or reason. (This might have been a "too many cooks in the kitchen" thing, where someone restored from a backup and didn't tell anyone.)
B) In the default configuration provided by the php5-mssql package in Ubuntu, text fields were chopped at 255 characters.
C) Our Windows Server did updates, then came back up with a "you need to renew your license" lock that prevented any I/O and cost us half a day of productivity (on top of the licensing costs).
And of course, D) it screwed up the file encoding when we did the final export during the "migrate to PostgreSQL" step.
Incidentally, migrating our platform to PostgreSQL and kicking the tires was one of the last projects I completed at that job, and everyone was much happier for it.
Now, you say that you possibly had too many cooks in the kitchen. So perhaps your complaints are due to the other cooks, or maybe some of your complaints came to you second-hand. But here is my analysis of your complaints, having run SQL Server for a number of years:
A) Just doesn't happen. Index-related statements are transactional and ACID-compliant. (i.e.: you can, in a transaction, add, drop, and update triggers, indexes, entire tables, etc. along with data.) This has been the case since at least I've started using SQL Server with the 2005 release.
B) I don't know why this is the case with php5-mssql, but I would blame the authors of the package. varchar, nvarchar, text, etc fields all support > 255 characters.
C) I don't know of any version of SQL Server or Windows Server that does this. No Windows Server version "locks IO" on a license check failure (this would obviously cause huge problems with filesystem consistency). As far as I know, no SQL Server version does live license checking or phones home. I have no idea what your symptom was, but I suspect someone else broke something. A running SQL Server installation does not "stop", nor does it need the license to be renewed at any point. Cumulative and service pack updates don't do license checks either, as far as I know. In fact, an expired license doesn't affect anything; the complaint probably reflects a misunderstanding of how licensing works. If you purchased a license, even if you paid yearly for 2-3 years, your SQL Server license is good forever. Forever! New versions require Software Assurance, but you have to choose to install those manually; it's opt-in, not opt-out. Automatic updates should never break SQL Server.
D) Is it at all possible this was the migrate to PostgreSQL tool's fault? Because I've run weekly backups and restores of SQL Server databases for the past several versions for many years, and had no problem with a .bak file being corrupt.
I appreciate the feedback though. This was my only time ever using MS SQL.
(Re: the 255 char thing, I had to edit the /etc/tsql/tsql.conf to change the sybase version to make the problem go away. Not a MSSQL problem, but definitely a problem with using it from a PHP + nginx on Linux env.)
b) i really wouldn't recommend hitting mssql from anything other than the home grown system.data.sqlclient built into .net.
c) there are best practices on when and how to install updates. this is not it.
d) that is totally a pain in the butt and something msft should probably figure out how to have a reasonable option for.
one, very legitimate gripe i have for mssql is the default lock levels on reads. not everything needs to be fully synchronized and having to litter queries with (nolock) feels like bad design.
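one hedge: if you can change the database setting, read-committed snapshot isolation gets you non-blocking reads without littering hints everywhere (database and table names below are made up):

```sql
-- One-time setting: readers see row versions instead of taking shared locks.
-- (Needs the database free of other active connections to complete.)
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;

-- After that, a plain query no longer blocks on writers:
SELECT OrderId, Total
FROM dbo.Orders;           -- instead of: FROM dbo.Orders WITH (NOLOCK)
```

unlike NOLOCK, this doesn't read uncommitted (possibly torn) data; the trade-off is extra tempdb version-store activity.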
one, very legitimate gripe i have for pg; no stored procedures. functions are not stored procedures. also multiple result sets for a single query, but i digress.
B) What? That's not SQL Server's fault, it's either a broken third party package or the display settings on your UI chopped the data to fit on a screen or something.
C) Never heard about this. I even tried to google it but I can't find any information about this. Do you have a source?
D) Third party tool problem probably. Not SQL Server's fault. Did the BAK file restore normally using SQL Server?
Also, IMHO one of the single biggest shortcomings of PostgreSQL is that there's still no baked-in solution for replication with any kind of multi-master or hot failover. If PostgreSQL supported this (in the box) as well as MongoDB does, along with making PL/V8 support easier to install, it'd be my db of choice.
MS-SQL does have a lot of shortcomings compared to PostgreSQL from a developer perspective, but from being able to easily install and administrate one over the other, I think MS-SQL has a significant lead here.
As long as replication/sharding are bolt-on solutions for PostgreSQL, it really isn't an option for a lot of projects.
Someday soon Postgres will have materialised views that auto update :)
However, as someone who works with MSSQL daily they're still not comparable. I'd say Postgres is much nicer in development and I'd use it for my projects, but MSSQL has much more scalability without developer intervention.
I do gather that a lot of DB admins that aren't very political about it do in fact respect MS SQL for some of its more competitive features. I'm just wondering if there's any reason to start with it for a low-cost startup scenario that could ultimately result in a non-trivial but fairly straightforward DB schema with potential for being used by very large institutions (universities at the largest I'd imagine).
I'm ignorant enough (primary experience is in web UI) that I'm leaning towards postgresql because I like the way the Django guys think and they seem to dig it. Also the no-nonsense license and yes, the not-profit-motivated thing is nice when backed by a strong core group which I gather postgres has.
But do the enterprise DB solutions handle severely massive amounts of scale better for some reason? Or is it more that they're culture-friendly to the sorts of companies that typically handle DBs of this nature?
Having been through many, many installs of Oracle and MS-SQL in my career, the speed at which you can get a PostgreSQL server up and running with data is simply awesome.
Not having an in the box solution for this is the single biggest thing keeping me off of PostgreSQL...
Yes, there are things that PG does better than MSSQL. Yes, there are things that MSSQL does better than PG. It sucks that we still have to debate it based on a smear piece from someone with an agenda.
1. HA, via replication, clustering, etc. is fantastic as well as easy, almost trivial, to set up. Once PG can offer a simple little wizard to set up replication, or shared-nothing clustering with automatic failover, awesome. Or even tx log shipping with a few clicks.
Instead, last I tried (9.0, I think), pg drops you off with some weird system that makes you run a shell as a specific user, where you run generic-sounding commands. Oh, and a config that seemingly has some bad defaults you should probably change.
2. Overall, MSSQL makes it easy to run a DB. I was doing a billion transactions a day (each of which wrote to a few tables and also included a real ACID balance update), and I didn't need a full-time DBA or have to have particularly deep experience beforehand. With pg, I sorta got somewhere, but I had little confidence.
3. Development is far superior on MSSQL. Supporting other languages is a moot point, because they aren't running as part of the query execution engine. That is, there's no real difference between using Python inside PG and an external client, as far as I could tell; you still had to submit queries and make a transition. So TSQL seemed far nicer to work with than plpgsql, although the record types in pg were much nicer, I'll admit. Another annoyance: pg didn't offer multiple result sets in many situations, which made it awkward for a sub-function that needed to return results from different tables as separate queries. And the perf tools and UI were just so, so much better.
4. There'd be strange "little" features you'd just expect to be there, like materialized views, which pg simply did not have for years. The current implementation is essentially pointless, as it doesn't update the view automatically.
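For context, the 9.3-era implementation looks like this (made-up names); the refresh being manual is exactly the complaint:

```sql
-- Defined once; populated at creation time.
CREATE MATERIALIZED VIEW order_totals AS
  SELECT customer_id, sum(total) AS lifetime_total
  FROM orders
  GROUP BY customer_id;

-- The data does NOT stay current automatically; you must refresh it yourself
-- (typically from a cron job or trigger-driven scheme):
REFRESH MATERIALIZED VIEW order_totals;
-- 9.4 adds REFRESH MATERIALIZED VIEW CONCURRENTLY (needs a unique index),
-- which avoids locking out readers during the refresh.
```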
I'm very much for PG, and believe it's an important project, and am trying to use it for future development. Largely because Microsoft went back on its word that it wouldn't move to an Oracle-style licensing where you pay for CPU power instead of just sockets. They've also made questionable decisions with Enterprise vs Standard, imposing extra cumbersome limitations. They aren't adding features like JSON or arrays quickly. And also, I think it's important to run open source and try to make sure free systems stay viable.
But MSSQL has a lot going for it, and the ease of use and built in HA options don't even seem like goals for PG. If licensing weren't in the way, and I just wanted an easy system that was capable but didn't require lots of time, MSSQL makes a strong choice.
We use both pg and mssql, I was surprised the article didn't mention replication. While one might be justified in describing pg as more "developer friendly", mssql is more "enterprise friendly". As a developer, I prefer pg, but as a stack designer, I still lean more toward mssql. Put another way, if I'm joining a project as a developer, I hope they're using pg, but if I'm consulting and designing a stack for a company I'll probably recommend mssql, almost definitely if they're doing something that requires replication, and/or the team isn't going to have lots of db expertise.
I guess this is quite self-contradictory on one level but I think it's a fairly common view (having checked with some colleagues).
I know that pg has support for replication, but it certainly isn't as easy to set up as mssql, and you need to do a LOT of reading to figure out what flavor you should use, and it's not entirely obvious what features you get with each, or which ones are really recommended and which are semi-deprecated.
CTO: Do you support replication? mssql: yes, and it's pretty easy to set up. pg: well, we've got 3 or 4 ways of doing that, depending on what you want exactly, method A isn't really maintained any more, and method D looks like it's going to be great in a release or two, so maybe that's the one to bet on, but in the meantime you have to put up with a bit of pain.
(dramatisation, probably inaccurate paraphrase, but that's what the cto heard)
I keep wanting to switch to PG every few months because it's open source, but keep being stumped by the lack of good tooling. pgAdmin is a joke, rarely have I ever used a more unintuitive piece of software.
For #2, there are plenty of managed PG services (AWS RDS, Heroku, etc) if you have trouble with the sysadmin.
For #4, Postgres has materialized view support in 9.3 with some improvements in 9.4.
https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_...
https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_...
Microsoft® WinFX™ Software Development Kit for Microsoft® Pre-Release Windows Operating System Code-Named "Longhorn", Beta 1 Web Setup
http://www.postgresql.org/message-id/CAM3SWZRP0c3g6+aJ=YYDGY...
http://sqlperformance.com/2013/02/t-sql-queries/another-merg...