- SQL queries flatten structured and hierarchical relationships into unstructured rows, often requiring very creative join operations to get at the exact set of data you want, after which you then have to convert everything back to the structured data you wanted in the first place. ORM's are a leaky abstraction on top of that, which means that when you use them you end up dealing with two problems instead of one.
- SQL is superficially simple, but at scale the simplicity is a lie. Thhe performance trade-offs you make require you to understand the internals of the database you're running on and how it interprets the query, and often you feel like you're dealing with a too high level API. Often you're resigned to "tricks" to force the engine to do the right thing, like optimizer hints, special index types or denormalized data.
- SQL has no ability to express relexation of the ACID principles. If you're streaming weather data into a SQL database, it could be ok to drop inserts here and there to gain throughput for the data analysis operations. SQL has no abilities to express such a thing (except in db-specific hacks).
There are numerous ways of dealing with hierarchical or otherwise structured data using relational DBs. Support for recursive CTEs is becoming more common, which makes such querying quite clean and simple. And with many relational systems offering support for storing and manipulating XML and JSON data, storing more structured data becomes a non-issue.
I think that SQL managing to hide a lot of the underlying implementation details is one of its strongest points. In many situations the defaults are more than sufficient, saving developers and users a significant time. Yes, there are situations where more control over the exact handling is needed, but it gives ways of managing this. At least a simplified, high-level abstraction is offered; many NoSQL systems immediately force you to take a very low-level approach for even the simplest of queries.
And I don't see the ability to intentionally lose data as something that's good, either. The database should be concerned with doing everything in its power to protect and ensure the integrity of the data it has been given. If it's fine to lose some of that data, then whatever or whoever is providing that data should just not provide it to the DB in the first place.
So while some people may present those as possible disadvantages of SQL, I just don't think the arguments really hold up upon further scrutiny.
Why should the language control the underlying data store to that level? Also, SQL itself is not defined by or defines ACID principles. You can (and some people do) run SQL over very non-ACID databases (everyone using MyISAM or SQLite?)
I personally feel that tying the data-store so close to the query language is a recipe for unneeded complexity. Though, the transaction isolation levels may be close to what you're talking about (though, I don't think there is anyway to tell a db like PostgreSQL "Go ahead, lose this. I don't care about it".)
> SQL is superficially simple, but at scale the simplicity is a lie. Thhe performance trade-offs you make require you to understand the internals of the database you're running on and how it interprets the query, and often you feel like you're dealing with a too high level API. Often you're resigned to "tricks" to force the engine to do the right thing, like optimizer hints, special index types or denormalized data.
Please tell me of anything where it isn't true that the more intimately you know your tools the better you can use them. Quite honestly most query planners are really friggin' good at what they do (just like a compiler, or are you a person that still claims hand tune assembly is the answer to everything).
EDIT: PostgreSQL does allow unlogged tables that are not crash safe, though I don't think they're randomly lose rows, but are faster. So there are some semantics that allow you to control that, but they are probably vendor-specific (getting into my complexity argument). Thanks Myon on #postgresql for pointing that out.
What exactly don't you like about SQL?
* SQL not being relational algebra
* SQL implementations being incompatible
Both are real but I have not experienced either of these being a problem in real usage. Despite the problems caused by NULLs and that SQL allows duplicates I do not think the alternative would be better for real world programming. And the incompatibilities are avoided by picking one database per project and sticking with it.If you have not experienced problems as a result of these core issues, it honestly makes me feel you must not have a lot of experience -- at least not in diverse projects and environments.
That you think picking one database is some sort of viable solution in the general case amplifies that feeling. That is so often simply not an option, and even where supporting only one database at a time is an option, you absolutely cannot guarantee that you will not have to migrate later. I've been through that pain many times, it is a real-world problem.