SQL vs. NoSQL (opens in new tab)

(sql.learncodethehardway.org)

55 pointsthinkerer12y ago77 comments

Theres a lot of debate on this and I understand theres no clear answers. Just wanted to hear some of your experiences working with both and whats applicable for what type of systems. I myself am working on a database for a web app analytics tool and witnessed some benefits of NoSQL while SQL is supposedly more reliable.

Understand companies like Amazon have shifted to NoSQL to manage their data. In short, I do think that SQL is for high processing, low disk space while NoSQL is vice versa in layman terms. Im pretty new to programming and appreciate a healthy discussion on this.

Please do not draw lines and have hate posts and all. Keep the discussion clean and intelligent.

Thanks!!

SQL vs. NoSQL

(sql.learncodethehardway.org)

55 pointsthinkerer12y ago77 comments

Please do not draw lines and have hate posts and all. Keep the discussion clean and intelligent.

Thanks!!

77 comments

42 comments · 8 top-level

twic12y ago· 21 in thread

The problem with SQL is it seems everyone hates its guts. It is a weird obtuse kind of "non-language" that most programmers can't stand.

I'm baffled. I'm a programmer and i rather like SQL. Most programmers i know use SQL quite happily. Indeed, i often hear them wishing they could just write SQL to solve some problem rather than use some API that purports to save you from writing SQL.

I certainly prefer SQL to MongoDB's query language, which is verbose and unexpressive by comparison.

Am i in a tiny minority? Or is this SQL hatred a popular myth?

pessimizer12y ago

It's not a popular myth, it's just become a trend. People want to use javascript or some dynamically-typed interpreted algol-like language for everything, and they're freaked out by SQL because it's both different and mathy.

It's also very, very easy, and almost completely fat-free as a mathematical expression of what you might want to do with data. I learned almost everything I needed to know about it within an afternoon ~10 years ago, and the dividends paid since on that investment have been extraordinary.

The use of "non-language" here just reflects the extremely narrow view of how languages must look amongst programmers of late.

randallsquared12y ago

When I see people complain about SQL, "mathy" never seems to come up. Mostly it's "tries too hard to be English".

1 more reply

crazygringo12y ago

I know SQL inside and out, and I hate it for anything put trivial queries that involve a single table and a single WHERE clause.

Every time I write a query of reasonable complexity, I have to figure out what I want the database to do (in an imperative sense -- read from this table first, then look up this in that table, then do this other thing with this other table for each row), and then figure out how to translate this into a declarative SQL statement, just so the database's query interpreter can turn that back into an imperative program it can run.

The whole thing is just a huge unnecessary round-trip of thinking, and SQL is limited in that there are many simple things it can't do, and you never really have any guarantee that the database's query analyzer will execute your query imperatively in the manner you hope for (which can lead to terrible performance problems, and hacks like index "hints" still don't always solve it). I'd MUCH rather be able to program against the database imperatively, directly.

So despite using SQL day-in, day-out, I totally agree that it is weird, obtuse, and a kind of "non-language" (in the sense that any query is limited to a single declarative statement, with a host of limitations, instead of an imperative programming language that allows you to write whole programs of execution).

nwenzel12y ago

Good points all the way around. Do you think SQL's obtuseness comes from it's ability to do more than (perhaps) it should?

Maybe it really only should be used for simple queries with simple where clauses. But because it can be used for more, we choose to do so.

It's based on relational algebra making it fairly predictable (query optimizer aside which is completely unpredictable). It operates on relations and returns relations.

"Select * from MyRelation;" returns a relation. So we could also write "select * from (select * from MyRelation);" and get the same result. You can continue that nested select as many times as you like and with as much complexity as you like and it would still work. Though undoubtedly it gets difficult to follow.

Adding to the obtuseness, SQL is not typed. So for "select * from MyRelation where SomeField = 1;" you could replace the 1 with another select statement, which would return a relation. But as long as that relation had a single column and a single row, it would be interpreted as a single value and successfully replace the 1.

Also adding to the obtuseness as you get more complex is that it variable scope for table aliases can be difficult to follow. Table alias names can be used across various select, where, from and other clauses to link them together through a common filter but it happens in the single SQL statement.

I think it's that common issue of reducing a "program" into a so gel statement that causes the issues I run into. At times, I feel that I am writing what basically amounts to a stored procedure in a single statement. That's when I start getting mad at SQL. Though, it's really my own choices that made it a problem.

sanderjd12y ago

This seems to be a good example of not allowing a fundamentally different paradigm to "sink in" but rather attempting to impose a paradigm you're already comfortable with onto it. Relational databases really aren't imperative systems, and while SQL is a pretty good declarative language for relational systems, it is indeed very frustrating to attempt to use it imperatively.

joesmo12y ago

I think the sad proliferation of horrific ORM systems and the thousands of hours put into them by many people is proof enough that this is not just a popular myth. While I wouldn't say that most programmers hate it, I'd say there is a fair share of those who do. Then again, considering the horrors that are built to avoid SQL, one wonders if they should be counted as programmers anyway.

jimktrains212y ago

Just because I hate extracting the results of a query into a structure the program can use, doesn't mean I don't like SQL. It's a very nice language to write relational algebra in.

1 more reply

Pxtl12y ago

I like relational algebra. I like the idea of a data-store language whose core features are the CRUD operations. But I'll pipe up and say I don't like SQL. Verbose string management, monstrous 3-value boolean logic, the lack of any linguistic shortcuts for joining on foreign-key constraints, etc.. The problem is that a good replacement for SQL would be a modernized relational language, not a NOSQL language that throws the baby out with the bath-water.

sanderjd12y ago

Yeah the SQL vs. NoSQL thing conflates so many different concerns. Sometimes it's about not wanting to do things relationally, sometimes it's about trading off ACID for other concerns, and sometimes (most rarely, it seems to me) it is actually about the SQL language itself. I don't think the language is really so bad (inconsistencies across databases aside), but I'd be interested in taking a look at a redesigned language that does all the same stuff but cleans a few things up.

jimktrains212y ago

You mean shorter than "JOIN fktable USING (key_name)"?

(Granted, that implies both columns have the same name iirc)

1 more reply

bitL12y ago

I don't like SQL either. Disclaimer: I wrote my own SQL server some time ago, then went on to help to prepare a NoSQL architecture at one of the largest tech companies.

SQL is a weird hack of relational algebra with a huge variation of syntax between different providers. Also, it's pretty inflexible if you need to change schema, you need to solve crazy transactional problems if you write your own server, not mentioning the problems arising from clustering.

Try modeling trees in SQL - if your data is mostly static, you will be saved by nested sets or their fractal successor, however most SQL servers mislead you in thinking their tree-functions are anywhere as performing. If you think you don't need trees, imagine modeling categories or an org chart in SQL, and making sure decisions at one level propagate correctly to all levels underneath and can be overriden only with correct permissions, and this must be a part of each query.

marcosdumay12y ago

I really like SQL, because all those hacks crazy transactional problems are encapsulated within it, and I don't have to deal with them in application code.

But yes, writing a noSQL server should be way easier than a SQL one. And yes, you have a point about trees, as do people complaining about queries results having no structure. Thigs could be better, but schema enforcement and data security in face of massive parallelism are quite worth it.

noblethrasher12y ago

Don't CTEs solve the problem of representing trees in a DB (modulo depth limit)? I have used them to construct schemas that represented org charts and other kind of trees. Right now I'm working on a project that will probably use an RDBMS to store the parse tree of a compilation.

2 more replies

eddieroger12y ago

I'm with you. I love SQL. I find it surprisingly expressive, despite having a rigid syntax. Some of my favorite projects that I've worked on thus far into life have been heavily SQL-based, and no successful code run has ever matched the thrill of seeing a SQL statement return what I expected.

I'd venture to say NoSQL has gained it's popularity because it's Magic(TM), and it works well with equally magical frameworks. For frameworks that don't have generators like ActiveRecord, NoSQL replaces the need to boringly write your models and database logic. It's like every Getting Started document on the Internet is having a race to the bottom, and people are falsely equating a quick start with a good framework. I have never picked a technology to work with because I could have a blog written in 10 minutes.

vertex-four12y ago

> I'd venture to say NoSQL has gained it's popularity because it's Magic

No, it's because to a first approximation, they aren't magic. Writing your queries in a more imperative style means that most developers can more easily reason about them, doubly so once they get more complicated.

You don't have to rely on generators or magic frameworks to generate SQL for you with a good document database. All your queries can quite reasonably be right there, and things like "get the document with the ID of foo" are already just one method call away.

Of course, most NoSQL databases have tradeoffs that are not obvious, while I believe all SQL databases stick to ACID. That's probably more of the problem than what query language is used.

2 more replies

tjr12y ago

After studying SQL at MIT with Greenspun, I came to see it not only as a nice language, but beautiful and expressive in a way similar to Lisp. I personally would rather write database applications mostly in SQL than pull as much as possible up into the scripting language through ORMs.

jsemrau12y ago

Popular myth. Definitely. There is hardly anything as beautiful and simple as SQL.

Amezarak12y ago

Agreed.

Years ago, when I took a database class in college, I hated it. Several years later, I began working at a consulting firm on the support team. My primarily role was defect/enhancement tracking and resolution in the product we had developed for them as well as their satellite apps.

The first time I really got my hands dirty with SQL was pulling out a bunch of VBA and Access "SQL" they were using to generate reports and to rewrite it into real SQL so it could be run on the backend, which cut times down from hours to milliseconds. It was okay (I'd hate those queries now) but it's where I really picked it up.

Then the same client started calling me to write (or edit) queries to pull out some information from their database I don't think anyone had anticipated they'd ever need. That's when I started to really appreciate how beautiful and expressive SQL can be. Stop thinking imperatively - SQL will let you do it but don't. No cursors unless you absolutely have to do. You probably don't. Dynamic SQL has its place, but try to stay away from that too.

There are a few queries I still remember with some fondness. One was to determine the number of days particular classes of employees worked on particular classes of projects. Unfortunately, the way their data was modeled made this not-at-all obvious - and the date information was stored in the "start" and "end" date type fields.

There was a lot more to it, but after pulling my hair out - the customer generally needed these in under half an hour - I suddenly realized I could use a recursive CTE to generate a dataset of all days that I could use to join with, filter out any duplicates, and then use aggregation functions to get what I needed. There were a lot of other details because of the complexity of their model, but it worked and it just seemed so elegant at the end.

I've moved on to another job now and honestly, writing SQL is what I miss most. I would have never pictured myself saying that before I worked there.

nknighthb12y ago

I'm probably one of those programmers you think "use SQL quite happily". You're wrong.

SQL is an abstraction with myriad problems in both its core design and its disjointed implementations. ORMs and similar interfaces are crappy abstractions on top of a crappy abstraction. You get the problems of SQL combined with the problems of the ORM.

Preferring to remove a set of problems and be left with only the problems of SQL is not the same thing as liking SQL, it is merely preferring the lesser of two evils.

mamcx12y ago

In what sense SQL is bad?

3 more replies

tbrownaw12y ago

I mostly like SQL.

The biggest annoyance for what I do, is that I don't have a clean way to edit a select list; that is "select foo., ... from (...) foo" works to add* columns, but I can't remove or rename or replace existing columns without listing all of them.

The second biggest annoyance, is the hoops I have to jump thru to join a parent to two (or more) sets of child rows without either multiplying everything or dropping data.

daleharvey12y ago· 5 in thread

There is some irony in opening the post proclaiming how important SQL is because of webSQL, webSQL has hung on by being the only choice with Safari, with Apple almost certainly adopting IndexedDB webSQL will cease to exist in any relevance pretty quickly (its already fairly niche).

This post is just filled with naive strawmen arguments, I work on a NoSQL database, show me a SQL database that is implemented embedded inside every browser and allows offline operations to sync between masters transparently ...

Trying to define or dismiss 'NoSQL' as a singular movement is not going to work, there are lots of tools, I use different ones for different things, some of them even have a SQL interface.

zik12y ago

NoSQL is a strange term. It really just means "everything else that isn't a traditional SQL database". It encompasses a massive range of completely different database technologies.

SQL has been astonishingly successful in recent years and it's a testament to its dominance that the term NoSQL even exists. It's kind of like calling all meats except for beef "NoBeef".

I'm really happy to see some different database technologies getting some attention these days. They each have their own niche. There's no SQL vs redis vs MongoDB vs cassandra vs whatever "winner", just databases with different strengths which can benefit us in different ways.

Pacabel12y ago

I don't think that's necessarily the definition of "NoSQL database".

Databases that predate Codd's work, or long-established ones like dbm and BDB, may appear similar to modern-day NoSQL databases in how they operate, but they surely aren't the same.

Those systems couldn't use relational theory or SQL because they didn't exist yet, or at least didn't reject them outright as one of their main goals.

NoSQL databases, on the other hand, are completely about rejecting relational theory and rejecting SQL. That's at the very core of their philosophy.

True NoSQL databases have been developed as a reaction to several things:

1. A very, very, very small number of situations where relational DB systems cannot easily scale.

2. The far more widespread ignorance of the basics of relational theory, and a lack of willingness to learn about it.

3. The far more widespread ignorance of SQL, and a lack of willingness to learn about it.

4. An urge to be "different" solely for the sake of being different, even if it brings no technological benefit.

5. Unmitigated hype surrounding the term "NoSQL".

Of those, 1) is perhaps the only legitimate reason for using a NoSQL approach today. The number of times this sort of a situation actually arises is remarkably small.

The other four are why those databases have become more widely used, especially within the web development community. As anyone who has dealt with such systems knows, they're rarely about safely and reliably storing and managing data, and they're rarely about doing this efficiently. They're merely a shortcut that some developers use to avoid learning how to use a RDMS.

2 more replies

chrisdevereux12y ago

If you have certain religious beliefs then the concept of NoBeef or NoPork is an important one.

I guess that databases must be the same for some people.

1 more reply

threeseed12y ago

Exactly. Anyone who tries to compare SQL and NoSQL is frankly an idiot. Cassandra is as different from MongoDB as it is from Oracle, Riak, MySQL or SQLite. They all have very different behaviours and many model SQL concepts quite closely. It just doesn't make any sense to group them.

Funnily enough it's always the PostgreSQL community that wants to create this artificial war. It's like they think everything in terms of one or the other when in fact most systems these days are heterogeneous.

rejschaap12y ago

What do you mean? Postgres is probably one of the most popular NoSQL databases. Its SQL-support is among the best as well.

SchizoDuckie12y ago· 3 in thread

A project from 2010 that's stalled half-way through in alpha state.

Still, he is quite correct. You still need a Structured Data Query Language.

mantrax512y ago

SQL remains a useful tool in the box, but while SQL and NoSQL are fighting it out to be the default go-to database technology, something else on the horizon is threatening the concept of having any database separate from the app at all: persistent in-application state.

We don't need a database if we can just load the application state in RAM and save it back via serialization. This works well in surprisingly many cases.

_delirium12y ago

That's a fairly traditional no-DB approach [1], but what's the reason to believe the dominance of that approach is on the horizon?

[1] Common Lisp has a particularly nice system for doing it transparently, http://common-lisp.net/project/elephant/, and I believe HN is implemented using a DB-via-serialization strategy as well

1 more reply

nawitus12y ago

The problems with that are easy to see though: you need lots of RAM if your database is non-trivial in size. You also need to write the state to the database continuously to prevent data loss from failures. Not saying the idea doesn't have use cases, of course. Redis (for example) is pretty popular these days, but I don't see how it'll become the default go-to database.

1 more reply

nawitus12y ago· 2 in thread

>The NoSQL movement is partially a reaction to antiquated database servers, and also a response to a fear of SQL borne from ignorance of how it works

I think the NoSQL movement was created for practical reasons. Since SQL didn't scale properly for certain applications, new kinds of databases had to be invented. What often happened is that a software used a typical SQL database with locks, and as that didn't scale, the locks were decreased to a point ACID wasn't theoretically guaranteed anymore. And it worked.

If the database in production doesn't guarantee consistency, you might aswell design a new database which is based on that idea, which is a core reason NoSQL databases were created. There's other reasons, too, of course..

jsemrau12y ago

"Movement" is a bit much, no? It is a solution for a niche problem.

Pacabel12y ago

"Movement" is a good term for it.

There are a lot of people out there who have a quasi-religious attitude toward NoSQL databases.

It's a cause for them to rally around. In some ways, it gives them something to "fight" for.

Objective consideration about whether it's the best technology to use in a given situation is often disregarded.

There are a very small number of very rare situations where such technology is the best or most feasible approach. Otherwise, it's merely something that a lot of people get involved with in order to intentionally avoid learning how to use SQL and relational databases, or to feel like they "belong" to some greater cause.

1 more reply

dorfsmay12y ago· 2 in thread

He is right, the issue has been servers not scaling horizontally.

SQL the query language is great, the concept of set theory applies really well to data. SQL can be and is used on non-SQL database, impala for example implements SQL to query data burried in hadoop.

NoSQL stores become popular because they scale horizontally (able to use more than one server) naturally. Once the SQL servers can do automatic partitioning, I suspect people will start migrating back.

The truth is that there is no pixie dust, at the end of the day you need to index. I see nosql proponent having the same strugles sql people have with indexing, but right now they have the upper hand because they can spread the work over several servers.

mattdeboard12y ago

I guess this might explain some of its popularity but I think it has much more to do with the fact you don't have to do any advance planning about your data (i.e. no need to write schemata). "Rapid prototyping" is much easier when you don't have to think about the your data, its types nor the relationships between them.

Of course, moving out of this phase becomes a massive headache, since basing your product on essentially unstructured data is a very good definition of "technical debt". And if you're using structured data in your rapidly prototyped object, why not just a RDBMS in the first place? :)

vertex-four12y ago

> Of course, moving out of this phase becomes a massive headache, since basing your product on essentially unstructured data is a very good definition of "technical debt".

Of course, it's possible to use things like JSON-Schema to validate your data if you choose to.

> And if you're using structured data in your rapidly prototyped object, why not just a RDBMS in the first place?

Because where's the RDBMS with a "natural" query language that is well-suited to complex, dynamic queries and document structures? How do people model a document store, with the ability to point to differently-shaped data, in an RDBMS without giving up all that safety?

1 more reply

SimeVidas12y ago· 1 in thread

> Or Why You Still Need SQL

Great way to make people close the page immediately after reading the title. (I personally don't need SQL for my small web app => the title's lying => close tab).

jimktrains212y ago

I don't need a hammer to screw in a screw, so I don't need a hammer. QED.

vertex-four12y ago

Personally, I find that SQL is hard to generate. It's hard to reason about the creation of dynamic queries, say, for some types of searches. It's not reasonable to create a generic function that can operate on any same-shaped piece of data. And it's ridiculously hard to manage the concept of a pointer to a piece of data which could be one of many types, a problem I'm sure most people have faces.

Stored procedures can help with some of this, but then you've got to remember to update them every time you need to access data in a slightly different way, and it's hard to figure out when you can get rid of old stored procedures. Tools for managing them aren't up to scratch with tools for managing "real" programs.

Personally, I'm keen on RethinkDB. It's a document-oriented NoSQL database which doesn't lose your data, has built-in clustering, and an extremely strong query language built on chaining method calls. While it doesn't have transactions, the query language is powerful enough that you can easily model most forms of deterministic data transformation within a single query.

bitL12y ago

I'll just put a few points here to understand when you should use which architecture.

NoSQL:

- reading is fast, writing is expensive (if all data are pre-processed/denormalized for reading during the writing phase)

- often schema-less

- low latency (for key-value storage)

- offline batch processing (classical Map Reduce)

- no ACID, choose 2 of 3 in CAP

- demanding on SW engineers to get client-side conflict resolution, tricky in general

- Petabytes of data can be suddenly processed

- huge variation of different paradigms, key-value, document, graph, batch etc.

- haywire indexing

SQL:

- writes are fast (normalization), reads are expensive (JOINs)

- ACID (well, only to some extent, clustering messes up many ACID properties unfortunately and conflicts arise in corner cases)

- set operations and a neat math theory behind them

- stable indexing, easily constructable real-time JOINs

- OLTP

- easier for developers

- non-flexible schema

- tradition, well-known recipes on how to do things

Basically, if you want to have low-latency access, your concurrency model allows eventual consistency, or you have a need to store your data in non-standard structure such as graphs/trees, use NoSQL and pre-process all data to be exactly in the format you require for reading.

If you need 99.999% guarantee of consistent data, amount of data you need to handle is under 50TB, you can put your data into a fixed schema and latency doesn't matter that much, use SQL.

I would recommend you to ask yourself a question - is your app/business read-heavy or write-heavy and decide accordingly.

j / k navigate · click thread line to collapse