SQL is 43 years old – Here’s why we still use it today

(blog.sqlizer.io)

588 points · d4nt · 9y ago · 382 comments

217 comments · 50 top-level

konradb9y ago· 28 in thread

Can anyone shed light on why there has been a phenomenon of people finding SQL 'too complex' and moving to noSQL? (Not sure if that's entirely fair but, from the outside, it is what it looks like). Is it hype driven? Are courses at university not tending to cover SQL that much?

charles-salvia9y ago

The main reason is the object-relational impedance mismatch[1]. Basically, programmers like working with objects that have data fields. This is because most modern, widely-used programming languages treat objects/aggregates with data fields as a first class concept. But SQL isn't designed around objects with fields, it's designed around tables, rows, and result sets from queries. Therefore, working with SQL in most modern programming languages generally requires layers of annoying result-set->object or object->row plumbing/conversion code. (Not to mention the vagaries of type conversions.) Of course, these days, this problem can be mitigated to a certain extent by clever ORMs, but an ORM is generally a leaky abstraction at best. Obviously, whether or not any of this bothers you will depend on your use cases and a lot of other factors.

[1] https://en.wikipedia.org/wiki/Object-relational_impedance_mi...
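A minimal sketch of the result-set->object and object->row plumbing described above, using Python's built-in `sqlite3` module. The `Customer` class, the `customers` table, and all helper names are hypothetical, chosen only for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical domain object; field names are illustrative only.
    id: int
    name: str
    email: str


def row_to_customer(row: sqlite3.Row) -> Customer:
    # result-set -> object: copy each column into a field by hand.
    return Customer(id=row["id"], name=row["name"], email=row["email"])


def insert_customer(conn: sqlite3.Connection, c: Customer) -> None:
    # object -> row: unpack the fields back into positional parameters.
    conn.execute(
        "INSERT INTO customers (id, name, email) VALUES (?, ?, ?)",
        (c.id, c.name, c.email),
    )


def load_customer(conn: sqlite3.Connection, customer_id: int):
    row = conn.execute(
        "SELECT id, name, email FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row_to_customer(row) if row is not None else None


def demo() -> Customer:
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets us read columns by name
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    insert_customer(conn, Customer(1, "Ada", "ada@example.com"))
    return load_customer(conn, 1)
```

Every new field has to be added in three places here (the class, the INSERT, and the mapper), which is roughly the repetition an ORM tries to hide.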

MarHoff9y ago

> But SQL isn't designed around objects with fields, it's designed around tables and rows.

I think a more correct analogy would be that tables are like classes, columns are the properties, and rows are instances. And so defining foreign keys is like setting a pointer to a parent instance.

There is no direct analogy for methods, but you can use functions/triggers to do the same job.

PostgreSQL is actually an object-relational DBMS; the fact that you are meant to manipulate these objects through SQL doesn't make them any less powerful. And SQL is actually Turing complete with PostgreSQL.

It's clearly not convenient for general programming, but as soon as data manipulation is involved you benefit from a lot of built-in optimization.

corpMaverick9y ago

I spent years trying to hide the database. Nowadays I just embrace it. The database is there. Tables and rows are as real as invoices or purchase orders. The database will still be there after my programs and I are gone. I use OOP to construct good gateways into my tables.

enobrev9y ago

I remember a few years back - I don't remember exactly when, but I'd say around '07-ish, definitely a couple years before MongoDB appeared - some of my peers who were highly proficient in Relational Databases began denormalizing their data to meet higher demands than we were used to. The basic idea was to turn tables into Key-Value stores and put Memcached in front of them. The higher the traffic, the less normalized the data became.

Around this time, NoSQL databases started popping up, and a lot of my colleagues moved on to them since it fit so well with the trend toward denormalizing everything. Some loved them and dove in full-force.

Personally, I kept working with normalized data, used a caching layer to handle the denormalized versions of the data and learned more about scaling with Master / Slave configurations, and honestly felt very much like I was being left behind.

In order to see what the fuss was about, I tried a small personal project with MongoDB (this is pre-redis, I think), and honestly enjoyed the simplicity of it. And then my toy project got a sudden popularity bump from 20 users to 40k in a day and my project just died. I spent two weeks trying to keep up and then kinda got it working, but couldn't keep the site alive for more than a day. And since it was just a toy project, I just gave up completely.

I've only used non-relational stores for portions of projects that explicitly warrant them, since.

VLM9y ago

There is also the eternal rotation of the corporate wheel of centralization and decentralization, so when the official corporate HQ Oracle team inevitably becomes too slow and expensive and difficult and incompetent to deal with as part of the usual business cycle, non-technical people will route around them and use Excel spreadsheets as their database, or dump JSON or XML files into a filesystem, or use mongo primarily to keep the corporate Oracle team away, like how garlic keeps vampires away. And then when the decentralized solution falls over from poor design or lack of scalability or lack of reliability or lack of backups or lack of support personnel, adult supervision will return and there's a push to massively re-centralize into professionally engineered schema and indexes.

Another aspect of "too complex" is sometimes the true data structure of the problem is correct and also too complex. Some programmers bite off small chunks and chew on them, then push back that the entire data structure of the problem should become the small successfully chewed up chunk. A model that encompasses all of the concept of "number" is very complicated, so a programmer starts off writing a small simple integer library and pushes back that the definition of number should become the simple integer... then reality impacts and as the system evolves and demands are made, the concept of "number" needs floats, rats, complex, base conversions, maybe worse, and the original very complicated design is the only successful way to implement the business requirement of "number". Whoops. If you baked into the cake at the start how to handle rats with zero denominators life would be a lot easier and safer than bolting it on later or praying the app level code handles it, for example.

arjunrc9y ago

This first paragraph applies exactly to us! We're the offshoot now because of a bloated central IT system and now that we have a good thing going, there's a corporate push to modernize IT and bring back all our offshoots to one place.

Loved the vampire/garlic analogy!

jasode9y ago

>a phenomenon of people finding SQL 'too complex' and moving to noSQL?

Fyi... the "SQL vs noSQL" means at least 2 different ideas and some of your replies are highlighting one aspect but not the other.

1) "SQL vs noSQL" can mean "SQL syntax (e.g. joins) vs object syntax to save/retrieve documents (e.g. JavaScript JSON/BSON)". This is probably the main driver of MongoDB adoption. (They don't care about MongoDB's scaling aspect; they just like the easier syntax.) The counterpoint to this idea is that "people are using terrible db engines like MongoDB because they are unwilling to learn SQL syntax."

2) "SQL vs noSQL" can mean "OLTP RDBMS engine (e.g. MySQL/Postgres/Oracle) vs distributed db engine (e.g. HBase/Cassandra/etc)". The counterpoint to this idea is that "people are deploying to distributed db engines when their use case actually fits in a 1-node RDBMS engine."

From your wording, it looks like you're talking about #1. To that point, many programmers don't like the complexity of 10-way joins of a dozen normalized tables to reconstruct a customer data entry screen. With a document-oriented noSQL db, you just retrieve the entire denormalized "document" with no joins. However, there are tradeoffs to the "easier" noSQL engine such as performance.
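To make the join-versus-document contrast concrete, here is a small sketch using Python's built-in `sqlite3`. A JSON string in a plain column stands in for a document store; the tables, columns, and function names are all assumptions made up for this illustration, and a real customer screen would involve far more tables.

```python
import json
import sqlite3


def build_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Normalized form: one fact per table, linked by customer_id.
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE addresses (customer_id INTEGER REFERENCES customers(id), city TEXT);
        CREATE TABLE phones (customer_id INTEGER REFERENCES customers(id), number TEXT);
        -- Denormalized form: the whole "document" stored as one JSON string.
        CREATE TABLE customer_docs (id INTEGER PRIMARY KEY, body TEXT);
        INSERT INTO customers VALUES (1, 'Ada');
        INSERT INTO addresses VALUES (1, 'London');
        INSERT INTO phones VALUES (1, '555-0100');
    """)
    doc = {"name": "Ada", "city": "London", "phone": "555-0100"}
    conn.execute("INSERT INTO customer_docs VALUES (?, ?)", (1, json.dumps(doc)))
    return conn


def load_normalized(conn: sqlite3.Connection, customer_id: int) -> dict:
    # Reassemble the screen's data by joining the normalized tables.
    row = conn.execute(
        """
        SELECT c.name, a.city, p.number
        FROM customers c
        JOIN addresses a ON a.customer_id = c.id
        JOIN phones p ON p.customer_id = c.id
        WHERE c.id = ?
        """,
        (customer_id,),
    ).fetchone()
    return {"name": row[0], "city": row[1], "phone": row[2]}


def load_document(conn: sqlite3.Connection, customer_id: int) -> dict:
    # Fetch the pre-assembled document with a single key lookup, no joins.
    (body,) = conn.execute(
        "SELECT body FROM customer_docs WHERE id = ?", (customer_id,)
    ).fetchone()
    return json.loads(body)
```

Both functions return the same dictionary. The difference is where the assembly work happens: at read time in the join, or at write time when the document is built (and rebuilt whenever any part of it changes).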

konradb9y ago

Hi - yes you are definitely correct to make that distinction and I was only talking about (1), having forgotten (2). I see your point about the large joins required to get the information you need for a screen.

VLM9y ago

I'd toss out a third explanation: nosql is not really a language thing but a normalized/denormalized thing.

Historically databases got used a lot where the short term cost of normalization was low and the long term cost of denormalized data was extremely high, like accounting or bank records or medical records.

The world has some data store applications in completely different environments, where the cost ratios are wildly different and the bandwidth load is extremely high. Like feeding the whole logging output of a webserver cluster into a DB for "data mining" or whatever. So minimum total cost in those weird new applications involves doing the opposite of what usually is, and remains, correct in the older, still profitable applications.

Inevitably, resume stuffing and neophilia being what they are, you end up with people trying to pound square pegs into round holes to boost their resume, look cool, gain experience, or for the sheer joy of trying something new. There's also the rush of transgressive behavior, short term thinking, etc. So naturally you get the extreme over-reaction of people arguing that corporate financial records should be nosql or your bank balance or credit record should be based on nosql, etc.

acdha9y ago

The two factors I saw:

1. Bad experiences with cumbersome ORMs and refusal to learn how databases work: basically fighting with performance or normalization problems and concluding the problem was the concept rather than using it poorly (“JOINs/subselects are hard so I'm doing 10k queries. SQL isn't web scale!”). A related component of this was optimizing for only part of the problem – e.g. someone's writing a web form and they found it appealing to slap arbitrary key:value pairs into a NoSQL store because they hadn't gotten around to writing the validation, reporting, etc. code which actually needed structure.

2. Hype, hype, hype: people would look at papers coming out of large places like Google, Yahoo, etc. with impressive numbers and think they needed the same infrastructure to impress the other cool kids, ignoring the fact that those papers mentioned traffic / data volumes many orders of magnitude higher and the huge companies could hire more engineers to make up for the extra time needed to hit that kind of scale. A lot of that thought was influenced by the pre-SSD era so the assumptions about when you exceed the limits of a single server really don't hold up to serious consideration now, or often even back then.

stillkicking9y ago

The promise of SQL is the ability to do arbitrary complex queries, but the key to making SQL fast is to tailor your indices and queries for the current use. These two are in opposition, so that most of what people do with SQL is the same simple key-based scans and grouping of linear data sets that NoSQL excels at, except with a big wrapping layer of serializing in and out of normalized tables. Every time I've seen people try to write complex aggregation and filtering tasks in SQL, the resulting queries are either a nightmare to maintain or performance is abysmal.

Another problem is trying to store dynamic schemas in SQL, e.g. data for a CMS with user-defined entities. You either have to expose a version of SQL to the client, which is dangerous and/or a huge hassle (see above), or you have to implement SQL-in-SQL with rows masquerading as columns. Neither is ideal.

For me it was a much different problem though that sold me: SQL has no concept of revision control or conflict resolution, and all update tracking must be done in-band, with manually incrementing versions and timestamps. Doing master-master style synchronization between two SQL databases (e.g. server and client) is a pain. Doing it with e.g. CouchDB was a breeze, because of its git-like revision tracking. Being able to shunt JSON in and out was a huge benefit, as simple arrays and hashes do show up everywhere. Being able to suck in data from production into a developer database using its built-in replication features was great too.
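For readers unfamiliar with "in-band" update tracking, the usual pattern is a version column that every writer must check and bump by hand. The following is only a sketch of that pattern with Python's `sqlite3`; the `notes` table and function names are hypothetical, and it is not a model of how CouchDB tracks revisions.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # The version column lives in the row itself ("in-band").
    conn.execute(
        "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO notes VALUES (1, 'draft', 1)")
    return conn


def update_note(conn: sqlite3.Connection, note_id: int, new_body: str,
                expected_version: int) -> bool:
    # The write only succeeds if nobody else bumped the version since we read it.
    cur = conn.execute(
        "UPDATE notes SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, note_id, expected_version),
    )
    # rowcount is 0 when the version check failed: a conflict the caller must resolve.
    return cur.rowcount == 1
```

Detecting the conflict is the easy part. Deciding which of two divergent edits wins, or how to merge them, is left entirely to application code, which is the pain point the comment describes.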

If SQL serves your purposes well today, that's great, but there are plenty of reasons to want to move beyond it. I'd like whatever Post-SQL is to handle nested data types, particularly if it comes with functional-programming-style algebraic closure of the resulting constructs. SQL-in-SQL should just be SQL.

In the meantime, I will just design data-first and acknowledge that how I _pull out_ the data is the main constraint anyway, SQL or not. For arbitrary queries, there's dedicated indexing and searching solutions like ElasticSearch that can actually deliver on doing that efficiently, without lots of careful babysitting.

AndrewOMartin9y ago

As a guess, SQL and noSQL are different tools good for different things and before noSQL people were trying to hammer their document-store shaped peg into a relational-database shaped hole.

To speculate, noSQL key-value stores became very popular because they allow you to model imperfectly defined situations, and to update that model quicker than in traditional relational-database type situations.

Consider the difference between writing some classes in Java, or bunging everything into a dict in Python. The Java solution could be more formal and well documented, but a pain in the ass if the model changes significantly. The Python solution is a bit more janky and probably sluggish, but can adapt quicker.

I think there are more people in things like research, self-teaching, iterative game dev, and exploratory startups that value the adaptability than there are in things like old school business consultancy for whom the underlying model might be static and well described.

jaredklewis9y ago

As the web grew, so did the scale of websites.

Many sites and apps found themselves in a situation where they would gladly trade strong consistency for more performance and eventual consistency.

Further, schema changes for relational databases with billions of rows have to be very carefully orchestrated to avoid downtime. Databases with more flexible schemas allow schema changes more easily (though obviously the code must accommodate the fluid schemas).

Various NoSQL solutions were developed to fill these needs.

Sure, they were overused for a bit (though that fad has basically passed), but they definitely arose to fill a real need.

TimJYoung9y ago

Schema changes, and how they affect up-time, are an implementation detail in relational databases. It's possible to design a relational database engine that doesn't require huge amounts of downtime to alter the schema of large tables. It's all in how the data is stored.

Of course, the reality of the situation is all that matters, and most database engines at that time did not offer such features. So, I don't necessarily fault people for trying to find solutions that simply worked for their needs.

JustSomeNobody9y ago

I think it's like everything else, developers see what the top companies are using and think, well, if it works for them it has to be the best (and you can search for "what's the best..." on any developer forum and see what I mean). So in that sense, it's hype driven.

I don't believe this is the best way to decide on a technology.

jfroma9y ago

This doesn't make much sense to me. I think the answer is just lack of experience.

"noSQL" should be read as "not-only sql" and I think sql in this acronym should be interpreted as "relational databases". I think the relational model is convenient for a lot of purposes but at some point (maybe ~10 years ago) some companies were trying to use it for everything even when those things didn't actually fit in that model. So, "noSQL" for me actually means "alternatives" for different kinds of data (time-series, graph, key-value, documents, etc.) with redis, neo4j, graphite, mongodb, etc.

What I would like to see in the future is some ecosystem based on small pluggable modules that you can use to build the embedded store for a microservice. Instead of using a full-feature database system (relational or not) you assemble your store like Legos with only the stuff you need. Something like this https://github.com/level/levelup/wiki/Modules

dboreham9y ago

It is hype driven. Or perhaps driven by a failure to understand the subject, coupled with a natural tendency to look for simple, and ultimately trite explanations for everything.

bushin9y ago

The pot calling the kettle black.

metaphorm9y ago

relational data modeling is something you will learn from experience once you're developing an application where the business domain data is inherently relational.

everything you work on in school is not like this. you spend a lot of time studying in-memory data structures and doing tiny projects (ranging from 100 lines of code homework all the way up to 5000 lines of code midterm project) that implement in-memory data structures and algorithms to work with them.

so you get some inexperienced developer who only has their schooling to inform them and they're all stuck thinking in terms of arrays, lists, hashmaps, and trees. learning a new data model has a lot of cognitive overhead so the allure of a data persistence tool that gives you almost the same data structures you already know is very strong. "Why would I learn about relational data modeling when everything is just a hashmap anyway?" thus MongoDB is extremely popular with 23 year olds and extremely scorned by 33 year olds. 10 years of job experience drives the point home.

hackits9y ago

For me personally, the relational data model didn't really make much sense until I studied relational algebra. After that, practically overnight, my supervisor was amazed and wanted to know why my queries went from shit to amazing in their eyes.

sixdimensional9y ago

"NoSQL" databases existed before SQL. For example, look at multivalue databases like Pick/D3, or the sparse-map-like persistent data storage structures of MUMPS. We had plain text databases, key value storage, hierarchical databases or multivalue databases before we had relational ones.

Actually, pre-RDBMS those were the dominant systems. The concepts invented from RDBMS were the new kid on the block that was hot and cool and more powerful that supplanted the original "NoSQL" system and had remained dominant until we hit "Internet scale" and people wanted to 1) scale to huge data volumes requiring distributed databases and processing cheaply and 2) wanted to return to finding alternatives to the dominating SQL platforms.

erikb9y ago

Actually it was an attempt to get better performance when the sizes of data increased drastically. Then of course it became a trend and developed some ideas by itself that are unrelated to where it came from.

flarg9y ago

Just from my pov, SQL does not support a document view of data and does not easily support versioning of same and this is a huge problem certainly in complex business systems.

dragonwriter9y ago

> SQL does not support a document view of data and does not easily support versioning of same

SQL supports linear ordered or even branching versioning fairly easily (though only recently have many SQL-based DBs had decent tools for temporal versioning), and SQL-based object-relational databases (Postgres and Oracle, for example) have supported both document-oriented views over classical relationally-structured data and document-oriented data storage since well before the NoSQL craze.

kwillets9y ago

Honestly, I think there's a cargo cult of people who want things to be impressively complicated.

tkyjonathan9y ago

I think you missed the part in the article where it said “SQL - it’s so easy marketers can learn it.”

bitwize9y ago

Because MongoDB is web scale. It doesn't use SQL or joins and that's the secret ingredient in the web scale sauce. That and sharding.

mdpopescu9y ago

I think the people downvoting this haven't seen [1] :)

[1] https://www.youtube.com/watch?v=b2F-DItXtZs

cyberferret9y ago· 26 in thread

Been programming for 30+ years, and 99% of my projects use SQL databases. I've tried and dropped NoSQL many times. I still wake up in a cold sweat thinking about the earliest version of Firebase where I had a project that tried to join three tables together to get some meaningful data.

I still remember the response of a Firebase team member to my forum question about it - "These days, storage and computing power is cheap - just duplicate the child table as nodes in the main table and do it all that way for every parent/child relationship you have. Don't worry that you have to duplicate the same set of data for every parent that is related to the same child...That's how NoSQL works..." <shudder>

Even though I use ORMs in my project these days, every time I have to test a complex query, I write it in raw SQL first and check it before trying to make the ORM duplicate the same query.

Granted, NoSQL has its place and its advantages, but for me, when it comes to "money code", I will stick to SQL.

matwood9y ago

I feel like this is something I have to keep repeating, "the data always outlives the app code." If app code is required to make sense of the data, you are going to have problems. Of course NoSQL has a place, but only in very certain cases should it be your primary datastore. SQL (and the RDBMSs that leverage it) are built to store, protect, and provide structure to the data.

The second annoyance I have is the push for schemaless. Schemaless does not exist. There is always a schema, except in a schemaless data store, the schema has been moved to app code and then my earlier comment applies.

One thing I do agree with is that most ORMs are not very good. The best ones I have used are very thin layers over the SQL, like jOOQ. I think the lack of understanding of SQL and bad ORM experiences (Hibernate WTF) is what led people to think SQL/RDBMSs were the problem when in fact they were not.

fnord123 9y ago

>The second annoyance I have is the push for schemaless. Schemaless does not exist. There is always a schema, except in a schemaless data store, the schema has been moved to app code and then my earlier comment applies.

This is a great point. It's like types for programming languages. They always exist; it's just a matter of whether you have the compiler managing it or whether you need to keep it in your head when you're hacking.

> I think the lack of understanding of SQL and bad ORM experiences (Hibernate WTF) is what led people to think SQL/RDBMSs were the problem when in fact they were not.

Again, like types in programming languages, I think the ability to iterate quickly without having to predict how the data will be used in the future leads to 'schemaless' approaches which can be kicked out the door more quickly (maybe) than ponderous schema laden tables.

jasode9y ago

>Schemaless does not exist. There is always a schema, except in a schemaless data store, the schema has been moved to app code

I get what you're saying but keep in mind that "schemaless" isn't a philosophical statement about denying ontology[1]. Instead, it's an industry label for avoiding database schema operations. E.g. a rapidly-evolving app doesn't know ahead of time all the rigid columns they need so they use "schemaless" db strategy to avoid "ALTER TABLE ADD COLUMN X" or avoid an export/import to new v2,v3,etc tables. From the relative point-of-view of the db engine, it doesn't see a "schema" when the so-called "columns" are embedded as strings inside of generic fields.

[1] https://en.wikipedia.org/wiki/Ontology

techno_modus9y ago

> Schemaless does not exist. There is always a schema...

I also think so but the issue has several important aspects which justify the use of the separate term "schemaless". One of them is that schema elements (say, column names) can be stored as normal data. For example, instead of having normal columns like Name, Age, Department, we could introduce a column storing these strings in 3 rows. As a result, DBMS is simply unaware of the schema - the schema exists only in our head (and in the app). As a consequence, DBMS cannot help us too much in managing data, and instead our app becomes responsible for these tasks, hence we get problems you mentioned. But the major problem is that currently there is no technology that allows us to say that this table column stores actually column names which can be used in queries and have to be treated as normal columns.
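The "column names stored as normal data" layout can be sketched as follows with Python's `sqlite3`. The `attrs` table and its column names are assumptions for illustration; the attribute names Name, Age, and Department come from the comment above.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # The DBMS only sees three generic columns; "Name", "Age", "Department"
    # are ordinary string values as far as it is concerned.
    conn.execute(
        "CREATE TABLE attrs (entity_id INTEGER, attr_name TEXT, attr_value TEXT)"
    )
    conn.executemany(
        "INSERT INTO attrs VALUES (?, ?, ?)",
        [
            (1, "Name", "Ada"), (1, "Age", "36"), (1, "Department", "Research"),
            (2, "Name", "Bob"), (2, "Age", "41"), (2, "Department", "Sales"),
        ],
    )
    return conn


def pivot(conn: sqlite3.Connection) -> list:
    # Turning the rows back into "columns" has to be spelled out by hand,
    # one CASE expression per attribute the application knows about.
    return conn.execute(
        """
        SELECT entity_id,
               MAX(CASE WHEN attr_name = 'Name' THEN attr_value END) AS name,
               MAX(CASE WHEN attr_name = 'Age' THEN attr_value END) AS age,
               MAX(CASE WHEN attr_name = 'Department' THEN attr_value END) AS department
        FROM attrs
        GROUP BY entity_id
        ORDER BY entity_id
        """
    ).fetchall()
```

Note that Age comes back as text: because every value shares one generic column, the engine cannot enforce a type or a constraint per attribute, which is the loss of DBMS help the comment refers to.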

fortpoint9y ago

I saw Mike Stonebraker while he was on his VoltDB "tour" a few years back. He makes the same arguments. In his talk he walked through the history of database technology and discussed a similar "war" between CODASYL and SQL. He framed that debate in terms of the split between separating the concerns of your data's structure (schema) from application logic.

Fairly or not, he suggested the current fight between noSQL/Schemaless vs SQL/RDBMS was being fought in ignorance of all that went on in the 70s.

emilecantin9y ago

Yeah, I think the good ORMs make the simple case _really_ simple (like User.getById(1234)) while letting you drop down to raw SQL easily when you need it. Any ORM that doesn't let you do that is not worth using, IMO.

ZenoArrow9y ago

Thank you for recommending jOOQ. I find many ORMs seem unnecessarily complicated compared to raw SQL (with a few exceptions), nice to find another one that offers readable code. Is the performance generally decent too?

ams6110 9y ago

Agree so much.

The first thing I do for almost any new system is design the data model. Once that satisfies all the requirements, building an application and UI on top of that is usually pretty straightforward.

sixdimensional9y ago

I find a simple analogy for schemaless in the OO world as well - an object without a class definition (for example a dynamic/anonymous object) and/or an interface has unknown properties and methods and is thus of highly limited use.

Just like a table without a schema... because after all, it's still likely objects all the way down.

There is no such thing as working without a "schema", because software simply can't function without data and corresponding metadata to describe it.

lukaseder9y ago

> There is always a schema

It has a name. Schema-on-write (static schema) vs. schema-on-read (dynamic schema).
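A rough sketch of the two approaches side by side, using Python's `sqlite3` and `json`. The table names and the validation rule are assumptions made for this example: in the first table the engine rejects bad records when they are written, in the second anything is accepted and the reader has to check the shape.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # Schema-on-write: the engine enforces the structure at insert time.
    conn.execute("CREATE TABLE typed (name TEXT NOT NULL, age INTEGER NOT NULL)")
    # Schema-on-read: the engine stores an opaque string and enforces nothing.
    conn.execute("CREATE TABLE blobs (body TEXT)")
    return conn


def write_typed(conn: sqlite3.Connection, record: dict) -> bool:
    try:
        conn.execute(
            "INSERT INTO typed VALUES (:name, :age)",
            {"name": record.get("name"), "age": record.get("age")},
        )
        return True
    except sqlite3.IntegrityError:
        # A missing field becomes NULL and violates NOT NULL.
        return False


def write_blob(conn: sqlite3.Connection, record: dict) -> None:
    conn.execute("INSERT INTO blobs VALUES (?)", (json.dumps(record),))


def read_blobs(conn: sqlite3.Connection) -> list:
    people = []
    for (body,) in conn.execute("SELECT body FROM blobs"):
        doc = json.loads(body)
        # The schema check happens here, in application code, on every read.
        if isinstance(doc.get("name"), str) and isinstance(doc.get("age"), int):
            people.append((doc["name"], doc["age"]))
    return people
```

The malformed record is still sitting in the `blobs` table after the read; it is merely skipped, which is one way the schema ends up living in app code.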

Sir_Cmpwn9y ago

I ended up using NoSQL for ephemeral data and SQL for persistent data in most of my applications.

gregmac9y ago

I've largely abandoned ORMs that generate SQL queries for me. A mapping layer that takes a SQL result and stuffs it into an object (DTO) is useful, but anything beyond that is more trouble than it's worth.

For complex SQL, it's better to write it by hand. You get to tune it and ensure the indexing is correct (either the query usage, or create/modify your indexes), and you get to see exactly what it's doing.

For even simple conditional joins, there's something nice about knowing exactly what's going on, and knowing it's not going to be doing something that results in a nested join loop (eg: the type of slow query you don't notice until you're using it in production and performance suddenly drops).

I'm a big fan of Dapper for this, so a lot of my data layer code looks like:

    public IEnumerable<Order> LoadOrders(int customerId) 
    {
        using (var db = GetConnection())
        {
            return db.Query<Order>("SELECT * FROM orders WHERE customerId = @customerId", new { customerId });
        }
    }

The one exception I make to ORM-generated code is INSERT/UPDATE queries. For most cases, the ORM doesn't have much it can screw up, and since it's essentially just mapping code ("Name = @Name, Address1 = @Address1, etc") it's more likely I'll make a typo or copy/paste error than anything else.

I've spent too many hours debugging crappy ORM queries, and I find it generally takes twice as long (with quadruple the frustration) to get the ORM to generate the complex SQL you want it to vs just writing the SQL.

senorjazz9y ago

I fully agree. For simple things (very simple) ORMs can be useful; for anything else they get in the way.

First, you have to write out the SQL, then convert it to the ORM format, only to find the ORM doesn't support something you are using, so you rewrite the query in a more long-winded way, then convert it to the ORM.

swalsh9y ago

This is funny to read, I've recently started giving up on ORMs, but I felt really guilty about it. Like I have to really justify myself. Like you, I use it in a hybrid manner, simple queries I still use it... but for complex selects, I have yet to meet an ORM that can dish it out as well as I can.

electrum9y ago

That looks very similar to JDBI for Java: http://jdbi.org/

ReidZB9y ago

At my workplace, we've started using jOOQ for this sort of thing. https://www.jooq.org/ (albeit we're a Java shop, not a C# shop)

It still sort of generates SQL for you, but really you end up writing the SQL yourself except in a type-safe way.

overcast9y ago

All I want is a modeling layer, and callbacks, like thinky.io provides for RethinkDB. It's very rare when any query I'm writing can just be generated by one of these ORM packages. It just becomes more confusing, and less powerful.

cryptonector9y ago

Even for data modification statements, if you can use TRIGGERs, CHECK and other constraints, and/or stored procedures / functions to implement business logic, then you'll be better off doing that than using an ORM.
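As an illustration of pushing business rules into the database, here is a sketch with a CHECK constraint and a trigger, run through Python's `sqlite3`. The `accounts` and `audit_log` tables and the "no negative balance" rule are assumptions invented for this example, and the syntax is SQLite's; other engines differ in detail.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE accounts (
            id INTEGER PRIMARY KEY,
            -- Business rule enforced by the engine: no overdrafts.
            balance INTEGER NOT NULL CHECK (balance >= 0)
        );
        CREATE TABLE audit_log (account_id INTEGER, old_balance INTEGER, new_balance INTEGER);
        -- Every successful balance change is logged, whichever client made it.
        CREATE TRIGGER log_balance AFTER UPDATE OF balance ON accounts
        BEGIN
            INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
        END;
        INSERT INTO accounts VALUES (1, 100);
    """)
    return conn


def withdraw(conn: sqlite3.Connection, account_id: int, amount: int) -> bool:
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the update.
        return False
```

The application function contains no validation or logging code of its own; any other program writing to the same tables is held to the same rules.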

cryptonector9y ago

Typically there are two arguments for NoSQL: "no SQL!" (i.e., "the SQL language is so ugly, difficult, and painful") and "no ACID" (i.e., eventually consistent, hopefully, maybe).

The first one is always demonstrated to be a terrible argument when someone creates a SQL alike for whatever NoSQL we're talking about.

The second one has been a more durable argument. But transactional semantics clearly are independent of query language! And technologies like Lustre and Spanner show that one can still have some measure of traditional transactional semantics in distributed systems (filesystems and databases, respectively). There are many applications where some degree of "eventually-consistent" is a great tool for making them scale, but this is often a function of what can be done in the event that inconsistencies create problems (e.g., a store selling more items of some product than it has available, a case where the store can refund the customer or delay delivery).

I'm extremely skeptical of NoSQLs, as you can tell. I would say that NoSQLs have NoPLACE. (Certainly as to that first argument.)

I'm also extremely skeptical of ORMs. So far as I can tell ORMs only ever add a layer of headaches. They seem to aim for simplifying the DB experience for developers and users, but the moment you step out of the small world of queries made simpler by the ORM... you're in for a world of hurt.

Zak9y ago

I'd actually quite like a relational database with a language that isn't SQL. I find SQL's syntax pretty irregular and awkward (e.g. update having a different syntax from insert), find the tools available for abstracting and composing queries (views) underwhelming and hate that schemas aren't declarative.

The first complaint derives largely from the SQL language being intended to read like English text. Languages like this show up regularly, seemingly based on the misconception that syntax is what's hard about programming. Most of them fail, as they should. SQL didn't, because it was tied to the exceptionally useful relational database.

The second is best illustrated by showing an alternative approach as an example: Korma for Clojure (http://sqlkorma.com/). It's easy to compose queries from simpler queries, just as we compose functions from simpler functions. Views provide some of this.

The third, if it's solvable, would make using relational databases much nicer. Instead of schemas being manually-created diffs full of imperative statements like create and alter, make them declarative and have software generate the diff. If transformations on the data itself are required between versions, require it to include a pure function describing said transformation. By default, require a function to transform it back as well unless it explicitly says what data is to be discarded.
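A toy sketch of the "declare the schema, let software generate the diff" idea, written against SQLite via Python's `sqlite3`. The `DESIRED` structure and the function names are hypothetical. It only handles new tables and added columns; drops, type changes, and the data transformation functions mentioned above are deliberately out of scope.

```python
import sqlite3

# Hypothetical declarative description: table name -> {column name: SQL type}.
DESIRED = {"users": {"id": "INTEGER", "name": "TEXT", "email": "TEXT"}}


def current_columns(conn: sqlite3.Connection, table: str) -> dict:
    # PRAGMA table_info yields one row per column: index 1 is the name,
    # index 2 the declared type. A missing table yields no rows.
    return {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}


def plan_migration(conn: sqlite3.Connection, desired: dict) -> list:
    # Compare the declared schema with the live one and emit the imperative
    # statements needed to close the gap.
    statements = []
    for table, columns in desired.items():
        existing = current_columns(conn, table)
        if not existing:
            cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols})")
            continue
        for name, sql_type in columns.items():
            if name not in existing:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
    return statements


def migrate(conn: sqlite3.Connection, desired: dict) -> None:
    for statement in plan_migration(conn, desired):
        conn.execute(statement)
```

Running `plan_migration` again after `migrate` returns an empty list, so the declaration can be applied repeatedly without the developer writing any ALTER statements by hand.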

samirillian9y ago

To be clear, though, you're talking more about the Relational Model than SQL per se.

cryptonector9y ago

There aren't very many query languages for relational DBs that come close to SQL's power.

pscarey9y ago

I'm currently using Firebase at work (startup, mobile app, web dashboards), and I think it's got some great features.

It's not a replacement for SQL, but an entirely different product (and has no doubt come a long way since V1, especially with Firebase Functions out recently). You have to make tradeoffs, like data duplication, for the advantages. From the sound of it, the advantages probably aren't relevant to you, but I'm confident they exist for us.

- We only have to store a comparatively small amount of user data for the lifetime value of each user, so the actual database cost is marginal (i.e. data duplication is not expensive to us, as long as reads are fast). (Development) time is (very) expensive.

- Rapid development and iteration. I can make additions and make fields redundant with ease in response to customer feedback, my own development choices, and changing product needs. There's no API update needed, as reads are client side. I have data model classes which store the reference in the database, some constructors, and getters and setters (About the same as any other data model + API would have). The release goes live and data just starts being manipulated in a new way. Alterations can have simple defaults with || in JS and ?? in Swift (even if it's just an empty string).

- Minimal backend. Our application doesn't require much backend logic, and that which we do have is mostly event driven with Firebase Functions, upon certain database writes. Works out quite cheap too (for now - most usages don't trigger any functions).

- Caching, and offline first. We take the approach that it doesn't matter if there's a short delay in 95% of our data updating - either due to patchy mobile signal or the data taking time to sync down. Firebase has made on device caching and real time updates to that data a breeze. With GeoFire, we've got instant map search (keys are cached by Firebase and contain sufficient information to filter, etc).

- Declarative security rules are very powerful, and can even reference values in the database itself for security control - effectively. We store a permissions tree in the database for our more complex security logic.

As a bonus, additional features like Authentication, Analytics, Push Notifications, etc are all convenient to have bundled up.

As you can see, most of this isn't about the database tech, but the development speed for us. The perceived costs and inefficiencies may exist in the database itself, but there are massive advantages for the application development as a whole.

cyberferret9y ago

I should probably clarify that I really liked Firebase, for all the reasons you mentioned above. It was really the schema restrictions, and (at the time) the lack of relational functionality that made me scrap it for a mobile app project that I was working on at the time.

This was very early days too, and I believe Firebase has improved considerably since then, and indeed I did a little "fun" project in it recently [0] (directly related to HN actually) and enjoyed the process.

I may still revisit it sometime, and see if it will suit another project.

[0] - https://hackernoon.com/tophn-a-fun-side-project-built-with-v...

pkulak9y ago

Most projects I write eventually turn into an RDS and Redis. Storing non-relational data can be super handy, and putting it somewhere where you know it can disappear at any point keeps you disciplined. haha

manigandham9y ago

What does this have to do with the article? SQL - the query language - is completely separate from any database that implements it.

ransom15389y ago· 19 in thread

SQL has the great side effect of creating structures that future employees can understand. It's a set of tables with relationships. Given these you can quickly inspect the structures independent of the programming language du jour. With a few commands a new employee can understand: business logic, hr, billing, reporting, and other major backbone principles in a matter of hours. On the other hand, I have been at companies that jumped on the noSQL / ORM / wrap it until you can't wrap it / hide sql (rails). When new employees show up... well... it's a bunch of semi/no structured stuff spread across thousands of lines of specific logic.

pjmorris9y ago

"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious." - Fred Brooks, The Mythical Man Month.

vog9y ago

Note that "tables" in this statement had a different meaning than today. That one wasn't about relational databases, it was about tables as in-memory data structures.

In today's words, "flowcharts" means "code" and "tables" means "data structures".

I don't care if it is:

- tables in a relational database

- nested structs (records) and lists

- nested dictionaries (hash tables) and lists

- JSON

- XML

- ... whatever

What I do care about is that I can see the data structures, and not just the code. Static typing often gets that job done fairly well. If you don't use that, please use at least type annotations. This is the most important part of your documentation, and the compiler ensures it remains correct over time. Most of the time, this (+ the function name) is the only documentation I really need.
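
As a small illustration of annotations acting as documentation, here is a sketch using Python dataclasses. The `Order` and `LineItem` names and fields are made up for the example, and since Python does not enforce annotations at runtime, the "compiler keeps it correct" part assumes an external type checker is run over the code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineItem:
    sku: str
    quantity: int
    unit_price_cents: int


@dataclass
class Order:
    order_id: int
    customer_email: str
    items: List[LineItem] = field(default_factory=list)


def order_total_cents(order: Order) -> int:
    """Sum of quantity * unit price over all line items."""
    return sum(item.quantity * item.unit_price_cents for item in order.items)
```

The shape of the data is readable from the class definitions alone, and the signature of `order_total_cents` says most of what a docstring would.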

scarface749y ago

Show me your well structured code and well named unit tests and I won't need your flow charts.

Show me your well defined normalized tables and I have all I need.

But for the love of $entity please don't show me tons of business logic in stored procedures. I actually won't work for a company that expects me to build or maintain a product based on stored procs.

mannykannot9y ago

I have seen at least one case where I could ascribe no meaning to several of the tables in a database. To be fair, seeing the code did not help either, until it became apparent that it was all an attempt to work around a fundamental error. The creator of this mess was blaming the users for it not working, and I was happy to refute that claim and exonerate the users.

forgotpwtomain9y ago

Rails tried to hide SQL so much that one of the consequences was not supporting proper table level constraints (FK) till AR ~4.2 or so. Just take a look at the DB of a company using Rails in production: hundreds of keys to non-existent relations.

geebee9y ago

I regularly use raw sql in rails. ORMs are fine for convenience, but for me, they never make or break an app. They can save you some typing on very basic stuff, but if you find you're writing complex ORM code, I think it's a good idea to just use sql directly.

jmfurlott9y ago

This fact alone, after several years of changing schema and migrations, led to us having major rewrites and porting over to a much more hands-off approach. It's hard to overstate how much confusion AR has caused, especially with new employees!

jgeraert9y ago

Seeing this as well in proprietary software sold by one of the biggest erp software vendors... And it's still not fixed and likely won't be fixed anytime soon.

elsurudo9y ago

That's true, but you could always either create them yourself in SQL in a migration, or use a gem like foreigner (as I did). Always use FKs...
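
For anyone who has not seen what the foreign key buys you, here is a minimal sketch using Python's built-in `sqlite3` instead of a Rails migration. The `authors` and `posts` tables are hypothetical, and note that SQLite only enforces foreign keys once the pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " author_id INTEGER NOT NULL REFERENCES authors(id),"
    " title TEXT NOT NULL)"
)
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO posts (author_id, title) VALUES (1, 'hello')")


def insert_orphan_post(conn):
    """Try to insert a post for an author that does not exist."""
    try:
        conn.execute("INSERT INTO posts (author_id, title) VALUES (999, 'orphan')")
    except sqlite3.IntegrityError:
        return False  # the database refused the dangling reference
    return True
```

With the constraint in place, the "keys to non-existent relations" situation cannot arise no matter which application code does the writing.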

jrochkind19y ago

You put your business logic in the rdbms? As constraints and triggers and stored procedures and such?

There are advantages to that, but pretty big disadvantages too. For maintaining and developing a non-trivial app, SQL, or a particular vendor's SQL variant, is definitely not my preferred platform. I definitely don't find that something easy to understand coming to an existing project that's been developed like that. I guess it _could_ be a matter of taste and experience, if it is yours!

hackits9y ago

Most of the time when business logic is put into triggers it turns into a disaster. Even Oracle recommends only putting simple logic within triggers and for the most part should only be used for auditing purposes.

For example, too many times I've updated a record with a new column value only to find out that some trigger had caused the value I'd just updated to be set to null.
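
A sketch of the "auditing only" style of trigger, using SQLite through Python's `sqlite3` (not Oracle, so the syntax differs from PL/SQL). The `accounts` and `accounts_audit` tables are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
CREATE TABLE accounts_audit (
    account_id  INTEGER NOT NULL,
    old_balance INTEGER NOT NULL,
    new_balance INTEGER NOT NULL
);
-- The trigger only records the change; it never rewrites the row being updated.
CREATE TRIGGER accounts_balance_audit
AFTER UPDATE OF balance ON accounts
BEGIN
    INSERT INTO accounts_audit (account_id, old_balance, new_balance)
    VALUES (OLD.id, OLD.balance, NEW.balance);
END;
INSERT INTO accounts (id, balance) VALUES (1, 100);
UPDATE accounts SET balance = 75 WHERE id = 1;
""")

audit_rows = conn.execute(
    "SELECT account_id, old_balance, new_balance FROM accounts_audit"
).fetchall()
```

Because the trigger writes to a separate table and leaves `NEW` alone, an update can never come back with a value the caller did not set.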

cryptonector9y ago

One advantage: you get all the write concurrency your RDBMS can give you for your transactions. Too many ORMs I've seen just have a Big Lock, leaving you with just one writer at a time.

I can't think of disadvantages to putting your business logic in the RDBMS. Can you elaborate?

pasta9y ago

I think he meant putting constraint and index logic in the code.

For example: I've seen people rely on a unique index in the code. But if you set a unique index in the DB you know it will be unique always. Even when the code is replaced some day.
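
A minimal sketch of that point with Python's `sqlite3`; the `users` table and the `register` helper are hypothetical. The application does no uniqueness check of its own and simply lets the database refuse the duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("CREATE UNIQUE INDEX users_email_unique ON users (email)")


def register(conn, email):
    """Insert a user; return False if the database rejects a duplicate email."""
    try:
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    except sqlite3.IntegrityError:
        return False
    return True


first = register(conn, "a@example.com")
second = register(conn, "a@example.com")
```

Any other program that writes to the same table gets the same guarantee, which is exactly what a check living only in application code cannot offer.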

Sean17089y ago

I'm not sure I'd call constraints business logic, and I would definitely want to at least double-check them in the database.

cr0sh9y ago

I could be wrong, but I think the main reason business logic is put into the DB (and provided that the "users" of the DB are also segregated properly), is to present a cohesive view to all users of the DB.

Too often, when you have a "wide open" DB and multiple "users" of that DB (think groups within an org) decide to query it (whether just for reading, or worse, updating), the queries can often turn out to be radically different for the same business logic.

Maybe one development group thinks that a query should be done one way, while another thinks that to get the same data it should be done another, and now you have two or more groups disseminating data to other groups in totally different ways.

You may ask why the organization has multiple development groups, but it occurs. I worked for one company I won't name because it isn't important - as a web developer in their marketing department; our group was considered as separate from the IT group, which handled the main "DB" which was based around AS/400 systems and Apache SOLR - we worked together as best as possible, but also bumped heads, in that they wanted us to only work with their "DB" (I don't really consider SOLR to be a DB, though it can be and is used like one by a variety of orgs) through their interface (which at times didn't work like they documented it - and many times changed without us knowing about it until our stuff broke mysteriously - usually around the end-of-year holiday push) - whereas we needed to store a lot of the stuff "nearline" in a MySQL DB (we used MySQL, PHP, and AWS for much of our development) to make things more responsive for our end users (ie - the people buying the products the company made off the websites we were creating). In essence, we had a bit of a "split personality" going on - but our main boss was only two levels removed from the CEO of the company, so our stuff generally was tolerated - but it is a similar situation.

Where you have multiple dev groups (whether by design or because "ad-hoc" things occur - like Bob in accounting figuring out how to use ODBC with Excel macros to query the database for data for his and other departments), you can have this kind of chaos. When the DB is heavily controlled by IT, with the only "views" of it through tightly controlled business logic which is part of the DB, this kind of difference in data can be controlled.

But it does have many more downsides, which also means it probably shouldn't be used or done in that manner. Instead, the interface should probably be through a single interface (RESTful or similar), with the business logic in the code of that interface, and only the barest needed other logic in the DB to tie it together. Provided that the security on the DB is tightly controlled, and the only way other groups can access the data is via the exposed interface API, you can achieve the same results I think.

In a way, that's how we had the access at that one company - we had a RESTful interface API to the backend SOLR store; we could send a formatted "query object" and get back one or more "records" as a stream of data (we'd then usually take the data, parse through it and store parts of it into our MySQL DB - because the query/response time of that SOLR DB was horrible from a web development perspective; I don't think this was the fault of IT, but rather the fact that the datastore on the backend was vast, holding information about products dating back 60 or more years - I'm sure there was likely some COBOL in the mix somewhere).

That did have the downside of the fact that if we wanted a particular means to query for something that didn't exist in the existing API, we either had to make do with what we could do "locally" (thru code and/or mysql "buffering"), or we had to put in a request for a change to the IT group (which may or may not get accepted, and might take weeks for the turnaround time before we could use it). Furthermore, as I mentioned before, there were more than a few times that we rolled out a particular feature, only to have the website(s) that relied on that feature break because the backend API changed "behind our backs" (and we usually saw this over a holiday period, when our sales would peak of course). In many cases, we couldn't do anything about this (not even storing the SOLR information - it was too vast, plus there was a nagging idea that if we tried that someone's head would roll for not using the implemented interfaces and data that already existed - we could "buffer" or "cache" things, but we couldn't wholesale transfer the data over).

brudgers9y ago

Exposing data structures in a human readable form was probably a necessity because SQL slotted itself into a space where COBOL lived...and one of the features of COBOL was its intent to create readable data processing programs. Like SQL, COBOL was generally a very reasonable alternative when practical engineering and business considerations were all on the table (e.g. mature well documented tools and the availability of employable candidates).

mkesper9y ago

Before databases you had only flat files or VSAM (and before that, ISAM) for your data. Good luck debugging your data in a VSAM cluster. https://en.wikipedia.org/wiki/VSAM

1ris9y ago

This is not true for any non-trivial database. I had to learn this. Business logic still has to be understood, there is still lots of domain knowledge about how this is intended to be used and how not, and in my case there was still lots of ambiguous, poorly documented VBA code to be understood. I'm not aware of a database without a frontend of some kind.

collyw9y ago

Personally I find SQL to be one of the easiest languages to read (when reading code from other developers).

davnicwil9y ago· 17 in thread

Just for fun: does anyone have any stories about initially using a SQL database for a project, later hitting problems, then switching to (or augmenting with) a NoSQL solution that solved those problems?

sergiosgc9y ago

I've reached a point where my default design is relational database augmented with memcache/redis. The relational DB is the authoritative source of truth, redis absorbs all (most) reads and memcache is, well a distributed cache. Writes go to both redis and relational. Redis data can always be rebuilt, and is regularly rebuilt, from relational data.

At project onset, usually only redis or memcache are in use.

It's a model that has served me well, apart from bugs that cause divergence between redis and relational. These are usually the class of bugs that would corrupt data were I relying on redis alone, so I just think of them as a lesser evil.

There's an obvious bottleneck on writes to the relational DB but I've so far managed to keep from hitting that ceiling.
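
A rough sketch of this layout, assuming a plain dict as a stand-in for Redis and an in-memory SQLite database as the relational source of truth. `CachedUserStore` and its methods are hypothetical names, and memcache is left out.

```python
import sqlite3


class CachedUserStore:
    def __init__(self):
        # The relational database is the authoritative copy.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
        self.cache = {}  # stand-in for Redis

    def write(self, user_id, name):
        """Writes go to both the relational DB and the cache."""
        self.db.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)", (user_id, name)
        )
        self.db.commit()
        self.cache[user_id] = name

    def read(self, user_id):
        """Reads are served from the cache, falling back to the DB on a miss."""
        if user_id in self.cache:
            return self.cache[user_id]
        row = self.db.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        if row is None:
            return None
        self.cache[user_id] = row[0]
        return row[0]

    def rebuild_cache(self):
        """Throw the cache away and repopulate it from the source of truth."""
        self.cache = dict(self.db.execute("SELECT id, name FROM users"))
```

The divergence bugs mentioned above correspond to `cache` and `users` disagreeing; `rebuild_cache` is the periodic repair step, and it only works because the relational side stays authoritative.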

scarface749y ago

I'm really becoming a big fan of CQRS/event sourcing where we have two read models - one NoSql for live data and one relational for BI/Reporting. Events are stored separately in JSON because that's how they come in anyway through the API.

Kiro9y ago

Kind of. I started building my game using SQL but realized everything became so much easier if I didn't have to follow a schema. For example I like to add arbitrary properties to the player objects all the time. With MongoDB it's a no-brainer. I just add the property and save.

elmigranto9y ago

> no schema

Do you read them back? Does your code expect property "X" to be on a "Player" object? Do you put a check everywhere for when it doesn't? Do you have default value for objects that were created before new property was added?

All that work could've been done by your database (which probably has some kind of JSON implementation anyways), but somehow it's a "no-brainer" to implement data consistency at app-level. Okay.
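
To make the "default value for objects created before the new property was added" point concrete, here is a small sketch with Python's `sqlite3`. The `players` table and `stamina` column are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO players (name) VALUES ('old_player')")

# Adding the property later: rows that already exist pick up the default,
# so reading code never has to check whether the field is present.
conn.execute("ALTER TABLE players ADD COLUMN stamina INTEGER NOT NULL DEFAULT 100")
conn.execute("INSERT INTO players (name, stamina) VALUES ('new_player', 80)")

rows = conn.execute("SELECT name, stamina FROM players ORDER BY id").fetchall()
```

One statement covers both the schema change and the backfill, which is the work the schemaless version pushes into every reader.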

72deluxe9y ago

Couldn't you have designed your schema to have a playerProperties table with a playerId column, a key column and a value column?

That'd be forever extensible and if you made the primary key a combination of playerId and key, you'd offload duplicate handling to the database.

Seems simple to me.
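
A sketch of that table with Python's `sqlite3`, keeping the column names from the comment. The `set_property` and `get_properties` helpers are hypothetical, and values are stored as text for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE playerProperties ("
    " playerId INTEGER NOT NULL,"
    " key TEXT NOT NULL,"
    " value TEXT,"
    " PRIMARY KEY (playerId, key))"
)


def set_property(conn, player_id, key, value):
    """Add or overwrite one property; the composite key prevents duplicates."""
    conn.execute(
        "INSERT OR REPLACE INTO playerProperties (playerId, key, value) VALUES (?, ?, ?)",
        (player_id, key, value),
    )


def get_properties(conn, player_id):
    """Return all properties of one player as a dict."""
    rows = conn.execute(
        "SELECT key, value FROM playerProperties WHERE playerId = ?", (player_id,)
    )
    return dict(rows)
```

The trade-off is that every value shares one column type and the database can no longer say which properties a player is supposed to have.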

pinouchon9y ago

Most of the stories I have heard are from people choosing a noSQL store for the wrong reasons. They end up going back to a relational DB because now they see the value of ACID. The typical case is Mongo => Postgres/Redshift.

fs1119y ago

I think most people in that group start out with a normalized schema in a SQL db and end up in a highly denormalized or even key/value style setup in said DB. Technically it is still a SQL database, but not used in that way.

jerven9y ago

public UniProt.org went from SQL normalized -> SQL primary key/blob to proper K/V (BDBje) to custom K/V in the space of ten years.

However, data production is still a mix of batch and sql systems.

zzzcpan9y ago

Switching a project from SQL to NoSQL is usually too expensive. Traditional relational databases lack a lot of necessary constraints to even have architectures necessary to make the switch to a distributed database, it can only happen with a complete redesign of everything. I think it's more common for those who had problems with SQL databases to drop them for their next projects and move on. For example, I consider a highly-available multi-site distributed database as a minimum viable database and haven't used an RDBMS in like a decade.

33degrees9y ago

If you consider ElasticSearch a NoSQL solution, then yes, many stories. A properly normalized database can be very slow to search, and throwing ElasticSearch in front of it is a very common solution.

prashnts9y ago

We used to use Postgres for almost everything, including full-text search.

Later when we needed a bit more control over search, we started using Elasticsearch. The way we do it now is to perform joins and get a nice serialized object with all the possible search/filter/sort fields, and put it in ES. All the data still remains in postgres, though.

danso9y ago

I had assumed Facebook started off with MySQL and now is NoSQL but that was just ignorant of me:

Zuckerberg may have had to rely on a typical LAMP stack but FB's newer user-facing features still use SQL. For example, the Timeline feature released in 2012 was built on MySQL/InnoDB: https://www.facebook.com/note.php?note_id=10150468255628920

And last year Facebook released MyRocks, which is a space/write optimized replacement for InnoDB, and is being used for their "user database tier" https://code.facebook.com/posts/190251048047090/myrocks-a-sp...

I guess the things I've read about FB using NoSQL [1] are for other parts of their infrastructure, particularly Messages. It sounds like they had considered using MySQL, even though Cassandra was built for the Inbox feature. They ended up building a new system for Messages [2].

Now that I've spent a little time in that rabbit hole, it looks like one answer to your question is Facebook's Inbox Search, which used MySQL originally to store inbox data (7TB for over 100M users, which seems laughably small with respect to Facebook's scale today):

http://docs.datastax.com/en/articles/cassandra/cassandrathen...

> Before launching the Inbox Search application we had to index 7TB of inbox data for over 100M users, then stored in our MySQL[1] infrastructure, and load it into the Cassandra system. The whole process involved running Map/Reduce[7] jobs against the MySQL data files, indexing them and then storing the reverse-index in Cassandra. The M/R process actually behaves as the client of Cassandra. We exposed some background channels for the M/R process to aggregate the reverse index per user and send over the serialized data over to the Cassandra instance, to avoid the serialization/deserialization overhead. This way the Cassandra instance is only bottlenecked by network bandwidth.

The acknowledgments section makes it more clear that the MySQL data was indeed migrated over to Cassandra:

> Cassandra system has benefited greatly from feedback from many individuals within Facebook. In addition we thank Karthik Ranganathan who indexed all the existing data in MySQL and moved it into Cassandra for our first production deployment.

[1] https://news.ycombinator.com/item?id=7891316

[2] https://www.facebook.com/notes/facebook-engineering/the-unde...

johnchristopher9y ago

I suppose they'd look first into MySQL's new JSON support. (Or other SQL databases that have some kind of JSON support)

Klathmon9y ago

PostgreSQL has fantastic JSON support and I've used it a few times.

Not only is it fairly fast, but it lets us tie JSON blobs that really don't need schema enforcement to be still tagged with the rest of the data and laid out relationally to still allow complex queries.

kalleboo9y ago

Does using memcache for caching count as "augmenting with a NoSQL solution"? Because that's pretty popular...

FLUX-YOU9y ago

Logging

lr4444lr9y ago

Yeah, I use MongoDB whenever my tables get too big. It's web scale! </s>

jackfoxy9y ago· 14 in thread

There just is no substitute for SQL. Some thoughts on what has given it a bad name:

1) The pervasive use of artificial keys. USE NATURAL KEYS. Unfortunately probably 99% of real-world databases were designed with artificial keys. I wish I could point to some literature on this topic. It is very rare and I only came to learn about this from a DBA who is well-versed in designing with natural keys. I'm trying to get him to publish more on this topic.

2) ORMs. This is just a bad practice. Their use in part derives from the awful schemas designed with artificial keys, requiring another layer of complexity to get a more intuitive model of the data. Fortunately for me over the past 3 years I've been doing almost all my application I/O with F#'s SQL Type Provider, SqlClient, http://fsprojects.github.io/FSharp.Data.SqlClient/ , which strongly types native query results, functions, and SPROCS. Just does not work if you need to construct dynamic SQL. I've been trying to goad the author into also providing meta data retrieval. That would be the icing on the cake.

3) SQL does not seem to be a required topic for undergrads. There are really no unsolved problems (of note), so it's not interesting to academics.

4) Most app programmers don't get much practice writing difficult queries or tuning problem queries, so that one time every 9 months when you do something hard, it is hard. (And again, often compounded by the complexity introduced by artificial keys.)

scriptkiddy9y ago

> The pervasive use of artificial keys. USE NATURAL KEYS. Unfortunately probably 99% of real-world databases were designed with artificial keys. I wish I could point to some literature on this topic. It is very rare and I only came to learn about this from a DBA who is well-versed in designing with natural keys. I'm trying to get him to publish more on this topic.

While I really enjoy the concept of Natural Keys, I just see so few places where they are applicable. If I was designing a banking Db schema, I could see using an account number as a primary key as long as I am not exposing my Db to anything but internal systems. However, if I'm designing a social networking platform, what can I use as a natural primary key for a user object? I can't use a name because not all names are unique and they can change. I can't use an email because they can change and I feel like Email addresses can get pretty large(more space and slower lookups). I could maybe use a username if I enforce that a username can never change and must be unique. But, we then run into the same issues as emails where a username can be fairly long and thus cause slower lookups. I also don't like the idea of using strings as primary keys either because I would need to take into account implementation details like string encoding (utf8, utf16, utf32, ascii, latin-1) and make sure to encode/decode on every lookup/insert.

So, I can see some use cases where natural primary keys make sense. However, I believe that for most use cases, artificial keys are a better choice. Integers don't take up a lot of space, it's easy to enforce uniqueness, they are a natural sequence, and they rarely, if ever need to change[1].

[1] In fact, I would make the argument that if you're altering primary keys at all, you're doing something wrong.

jackfoxy9y ago

Properly identifying the natural keys requires more up front thinking than slapping on an identity or guid column. Also it ends up with a different set of tables at the end of the day. So looking at your current schema and saying natural keys don't work here is probably true. It's a big topic, and like I said, not enough literature, but if you search sql natural key you can get started.

gnaritas9y ago

1) Natural keys suck in the real world, artificial keys have come to dominate for a real reason that you either understand or suffer for ignoring. Natural keys are bad practice.

2) ORM's are good practice, they reduce massive amounts of duplicated code in applications and massively lessen development time which generally dwarfs the cost of execution time. If you don't use an ORM you are pissing money down the drain, most businesses are reluctant to do that. Using an ORM doesn't mean not using SQL when appropriate, it means using the ORM in the 90% of cases where it works best. Complex SQL belongs in views, which are then trivially mapped with the ORM giving you the best of both worlds. If you're doing complex queries with the ORM, the ORM isn't the problem, you are.

3) Should be.

4) Nor should they, programmers are expensive, paying them to do what a library can do better is a waste of money and time. You need a guy or two who knows SQL well enough to tune queries and do indexing, that's your db guy.

pjungwir9y ago

> There are really no unsolved problems (of note), so it's not interesting to academics.

For anyone who likes SQL and is looking for a research topic, read about temporal databases and suggest some ways to handle DDL changes over time (e.g. adding a new NOT NULL column, or changing a relationship from one-to-many to many-to-many).

Here is your starting bibliography:

Richard Snodgrass, Developing Time-Oriented Database Applications in SQL.

Hugh Darwen & C.J. Date, "An Overview and Analysis of Proposals Based on the TSQL2 Approach".

Krishna Kulkarni & Jan-Eike Michels, "Temporal features in SQL:2011".

Tom Johnston, Bitemporal Data

Magnus Hagander, "A TARDIS for Your ORM": https://www.youtube.com/watch?v=TRgni5q0YM8

I would love to read what you come up with!

nickpeterson9y ago

I would argue that [http://www.AnchorModeling.com] is a pretty good take on it. I think it shares some genealogy with Fact-Based Modeling techniques (FCO-IM?). It has a really decent story for evolving the schema over time.

I've read Darwen/Date, Snodgrass, and Johnston's (two books) on the subject. Johnston's seems the best practical choice, but they patented the ideas :/ That said, one could probably implement Johnston's model without violating patents if you ignore explicitly modeling the episode structure and just follow the theory.

Date and Darwen make decent points about wanting to use the relational model rather than some baked-in concept, but ultimately they do almost no legwork on practically putting any of their notions to use. This seems to be par for the course on Date books.

Weis and Johnston handle the problem more directly. They also tackle a harder problem overall (BiTemporal, vs just Temporal). Also Johnston is just easier to read than Date. (Side note, I feel like a blog where I just read chapters of Date writings and condense the content to its basic points would end in most of Date's arguments fitting in 50 pages.)

That said, Weis and Johnston still punted on schema evolution. Anchor Modeling sort of starts at supporting an evolving schema and moves outward from there. In Anchor modeling there isn't the traditional Temporal/Bi-temporal notion, but rather positors that have varying degrees of certainty about posits. There is basically a concurrence of facts that can be retracted or changed over time. Useful for modeling varying degrees of certainty or alternate perspectives on data.

The downside of Anchor Modeling is that the datamodel is basically 6th normal form with a bunch of table valued functions, triggers, and views to aid in making it palatable for devs. Johnston sort of acknowledges this style in one of his books but I believe argues against the concept of an entity's information being spread throughout various tables. In his mind, a table is a type, that is made of attributes (columns). Anchor modeling is more along the lines of 'Attributes are types', and you can relate them to form larger types.

I wish there was more work on the ORM side supporting some of these concepts (Schema evolution and temporality).

jackfoxy9y ago

Yes, I would agree, temporal databases is an interesting topic.

cheez9y ago

Natural keys can be composite, which makes it a pain to manage cross-table relationships. I typically have a unique constraint on the actual composite key and an auto inc id.
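
A sketch of that pattern with Python's `sqlite3`: the composite natural key is still enforced by a `UNIQUE` constraint, while other tables reference the single-column id. The `flights` and `bookings` tables are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE flights (
    id             INTEGER PRIMARY KEY,   -- surrogate key, assigned by the database
    carrier        TEXT NOT NULL,
    flight_number  INTEGER NOT NULL,
    departure_date TEXT NOT NULL,
    UNIQUE (carrier, flight_number, departure_date)   -- the natural key
);
CREATE TABLE bookings (
    id        INTEGER PRIMARY KEY,
    flight_id INTEGER NOT NULL REFERENCES flights(id),  -- one column, not three
    passenger TEXT NOT NULL
);
INSERT INTO flights (carrier, flight_number, departure_date)
VALUES ('XY', 12, '2017-06-01');
""")


def add_flight(conn, carrier, number, date):
    """Insert a flight; return its surrogate id, or None if it already exists."""
    try:
        cur = conn.execute(
            "INSERT INTO flights (carrier, flight_number, departure_date) VALUES (?, ?, ?)",
            (carrier, number, date),
        )
    except sqlite3.IntegrityError:
        return None
    return cur.lastrowid
```

Duplicates of the natural key are still rejected, but joins and foreign keys only ever carry one integer.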

gnaritas9y ago

That is the correct solution, a surrogate key.

sobani9y ago

Can you give 1 example of a good natural key?

Note I will disqualify anything that has a reasonable chance of changing, like the primary email address of an account, a person's name, a person's day of birth or the 'public id' of a bank account

yawaramin9y ago

A composite key made up of foreign keys in a junction table.

irishsultan9y ago

How often does the birthday of people change? (Then again, how likely is it to be unique?)

ASipos9y ago

I studied SQL as the lab portion of a database design course. The sequel to that course was wholly dedicated to PL/SQL.

neuromantik80869y ago

And the sequel to that was postgrad?

combatentropy9y ago

The sequel to the SQL? ;)

js89y ago· 9 in thread

SQL is good, but it shows its age. Today somebody should come up with something statically typed and more functional (meaning using lambda calculus as a starting point).

The biggest pain points of SQL (IMHO) are:

- lack of static typing guarantees (for example, no guarantee that a certain table has certain column)

- bad capability to abstract over parts of the data model (for example, queries have to specify the table that they query)

Both of these can be resolved with use of good enough functional language. There are projects like that in the FP/Haskell community (e.g. Ermine), but it's fragmented.

laumars9y ago

You might need to explain yourself a little more because I'm confused by your examples:

> "- lack of statically typing guarantees (for example, no guarantee that a certain table has certain column)"

I would have described SQL as statically typed because columns are given a type when created. The issue you described seems more akin to you wanting auto-complete features for SQL (which do exist in some DB management tools) because while you can specify in code a column that doesn't exist, it will fail to compile on the RDBMS in much the same way that you can still write code that fails to compile in imperative and functional languages.

> "- bad capability to abstract over parts of the data model (for example, queries have to specify the table that they query)"

You don't need to specify the table name for each column if you're only querying one table. The issue arises with multiple tables. I see this a bit like having to specify the namespace in Go packages or class name in Java where you don't want to import every property and method into the global namespace. I'm not sure what the functional solution you envisage would be but there are workarounds in imperative languages such as the `using namespace` declaration in C++ and the `with` block in VB (I can't believe I just referenced VB in a serious discussion!). Generally though I've found the pain of referencing table names in SQL to be somewhat mitigated through aliasing them via their acronyms. eg

    select * from people p, borrowed_books bb where p.name = 'laumars'

SQL definitely isn't a pretty language though so I am very interested in your thoughts for how a more intuitive, functional-inspired, query language might read.

js89y ago

I think marcosdumay already explained it pretty well.

> I would have described SQL as statically typed because columns are given a type when created.

Both "statically" and "dynamically" are misnomers here, because it's something in between: like a dynamic language which can create, say, an empty list and declare that it can only contain integers.

Just quickly though - I believe static typing helps productivity, because you don't have to wait for some edge case in your program (in this case an infrequently invoked query, since SQL is often embedded in another language) to fail. There are other benefits, like QuickCheck.

I think a better term (which would explain the distinction between schema creation and query compilation) would be static typechecking of queries against the schema.

> The issue arises with multiple tables

In general, SQL is quite ad hoc in what it lets you do, and anything more complicated you need to resolve with code generation (for example, running the same query over different tables). It's very hard to reuse SQL code. That's ugly, and functional languages offer a better solution: higher-order functions. I think what I really want is to deal with schemas (and other things, such as result sets) as first-class types.

I think the language should be functional, but also total (every function will finish, no recursion allowed), to make it easy for compilers (query engines).

marcosdumay9y ago

> I would have described SQL as statically typed because columns are given a type when created.

The biggest problem is that SQL has no well defined "compile time". For all practical purposes, it's always an interpreted language. I can't check my code against the production database without running it.

> You don't need to specify the table name for each column if you're only querying one table.

I do think the GP was about doing stuff like that:

    SELECT * FROM function_that_returns_table();

elmigranto9y ago

> statically typing guarantees

int will be an int. You can't store string there. What else do you need? Something along the lines of SQL's `check` or Postgres's domains?

> queries have to specify the table that they query

How do you imagine not doing that, something like "pull some things and do stuff with it"? I think we are quite far from that kind of reality.

> certain table has certain column

I don't get it. It either has, or it doesn't. In one case you get a value, in the other an SQL error. If that's so vital, pull up some schema information and check before running an app.
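A startup check along those lines could be sketched like this, assuming SQLite and Python's built-in sqlite3 module. The `missing_columns` helper, the `people` table and the `EXPECTED` mapping are hypothetical; other databases expose similar information through `information_schema.columns`:

```python
import sqlite3

def missing_columns(conn, expected):
    """Return {table: [missing column names]} given {table: [required column names]}.

    Table names are interpolated into the PRAGMA, so they must be trusted
    constants from the application itself, never user input.
    """
    problems = {}
    for table, columns in expected.items():
        # table_info yields one row per column; index 1 is the column name.
        # An unknown table simply yields no rows.
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
        present = {row[1] for row in rows}
        missing = [c for c in columns if c not in present]
        if missing:
            problems[table] = missing
    return problems

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")

EXPECTED = {"people": ["id", "name", "email"]}  # what the app assumes exists
print(missing_columns(conn, EXPECTED))
```

This only moves the failure to startup; it is still a runtime check, not a compile-time guarantee.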

deathanatos9y ago

> What else do you need?

Me personally: sum types. (Some languages call these tagged unions. Rust calls it an enum, but note that enum here does not mean "an integer under the hood")

We have the case where we have an entity that, as part of its primary key, has a value that is either a valid integer or a sort of "Empty" value. It's part of the key, so I can't use a nullable integer to describe this column, as doing so would prevent me guaranteeing uniqueness (the "empty" value is only valid once, unlike a NULL in a unique constraint). I can use a (bool, int) column pair and some check constraints, but it leaves the integer exposed to poorly formed queries, such as SUM(the_integer_part). If I can dedicate a "special value" in the integer, I can use just the integer; that's similarly brittle. (SUM, like most other arithmetic, is valid, but ONLY if the column isn't the "Empty" value.) It'd be nice to be able to model the table as something like,

  CREATE TABLE (
    ...
    count_or_empty union { int | empty } NOT NULL,
    ...
    PRIMARY KEY (..., count_or_empty, ...)
  );

If the DB supported sum types, the type of count_or_empty here could deliberately be a union, which would not be compatible with SUM. You'd need to "unwrap" the union prior to doing such operations on it (check if it's Empty or not) and then do the appropriate thing: trying to blindly SUM on it would be a static error.
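For comparison, the (bool, int) workaround described above might look roughly like this in SQLite, driven from Python's built-in sqlite3 module. The `reading` table, its columns, and the choice of 0 as the placeholder stored alongside the "Empty" flag are all assumptions made for the sketch. It uses AVG rather than SUM because a placeholder of 0 happens not to disturb a sum:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reading (
        sensor   TEXT    NOT NULL,
        is_empty INTEGER NOT NULL CHECK (is_empty IN (0, 1)),
        n        INTEGER NOT NULL,
        -- The "Empty" variant must carry the fixed placeholder 0,
        -- so it can appear only once per sensor.
        CHECK (is_empty = 0 OR n = 0),
        PRIMARY KEY (sensor, is_empty, n)
    );
    INSERT INTO reading VALUES ('a', 0, 5), ('a', 0, 7), ('a', 1, 0);
""")

def insert(sensor, is_empty, n):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute("INSERT INTO reading VALUES (?, ?, ?)",
                         (sensor, is_empty, n))
        return True
    except sqlite3.IntegrityError:
        return False

# Nothing stops a query from treating the placeholder as a real number.
naive_avg = conn.execute("SELECT AVG(n) FROM reading").fetchone()[0]
# The caller has to remember to "unwrap" the pair by hand.
safe_avg = conn.execute(
    "SELECT AVG(n) FROM reading WHERE is_empty = 0").fetchone()[0]
```

The constraints keep the data well formed, but the naive average still runs without complaint, which is the brittleness a real sum type would turn into a static error.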

techno_modus9y ago

> SQL is good, but it shows its age

"The relational model is dead, SQL is dead, and I don’t feel so good myself". [PDF] - https://sigmodrecord.org/publications/sigmodRecord/1306/pdfs...

> Both of these can be resolved with use of good enough functional language

Yes, functional aspects are exactly what is missing in the RM in general and SQL in particular. Yet, I do not think this can be easily added - the whole paradigm has to be changed.

marcosdumay9y ago

SQL is a declarative reactive language. There's something to gain by adding functions as first-class objects, but I don't think rethinking it from functional foundations would help. Lambda calculus is focused on data processing, and a query language must be focused on data searching, those are completely different applications.

But I do completely agree with the pain points you pointed there. I'd go further on the first point, client software should have an easy way to check its guarantees at connection time too.

js89y ago

> There's something to gain by adding functions as first-class objects, but I don't think rethinking it from functional foundations would help.

So what other approach to designing a declarative language do you want to take?

I propose a total functional language because it's an extremely simple and systematic approach. You add your primitive types, primitive functions, primitive type operators (like algebraic types) and voila - you are pretty much set. That way, you get a lot of reusability, which would be hard to do in an ad hoc designed declarative language. And a very uniform syntax as a nice bonus.

My question is not rhetorical. Building declarative languages based on typed lambda calculus is very well understood.

metaphorm9y ago

can you elaborate? you haven't really explained these ideas in a way that helps me understand them but it sounds interesting.

what do you mean by static typing guarantees? tables have a schema already (static type declarations). table schemas are essentially all objects of the same type though (a collection of column types, and the possible values of those types are enumerated). are you referring to a meta-data level of type checking? like "schema of type Foo always contains a column named FOO with type varchar" or something like that?

what do you mean by "queries have to specify the table that they query"? do you mean that there ought to be templated or generic queries? I can't see a way out of specifying which data sets you want to query on though. I'm hoping you can use more detailed language to clarify your intent here.

plet9y ago· 9 in thread

SQL is really good at projecting & selecting simple data. My rule of thumb has been to use it first for any pet projects, but as soon as I need more than one table, think deeper about the data and switch to NoSQL if I need to represent complex data structures or have document storage needs.

It's still amazing how far you can go with a single table and a few tweaks to a Postgres instance.

collyw9y ago

Sounds like you need to learn how to design a database. Seriously, do you find more than 1 table complex?

plet9y ago

That's odd. I don't recall saying more than 1 table is complex. For me, it's just a good time to think more about what kind of database to use, esp for pet projects.

noja9y ago

As soon as you have more than one table you switch to NoSQL?

Did you mean to write it that way round?

plet9y ago

> As soon as you have more than one table you switch to NoSQL?

Nope. I just start thinking a bit more about whether I need to continue using an RDBMS or switch to NoSQL now. I can switch when the time is right for the project. It varies between projects. While learning Phoenix (Elixir) I stuck with Postgres because the tutorials were easier to follow. While creating a fancy blog engine I switched to Mongo.

danso9y ago

That's funny, the main reason I cite for going from spreadsheets -- which are powerful enough with pivot tables and VLOOKUP -- to SQL is the ability to use JOIN statements in the latter, i.e. when you need to work with data from multiple tables. Most production SQL databases involve multiple tables, not as a legacy hack but by deliberate design.

alunchbox9y ago

Haven't seen a reason to use any type of NoSQL aside from a cache layer. Most modern RDBMSs have JSON support if you still want to use a document approach for specific cases, but overall Postgres and SQL Server are able to perform equally well if not better than most recent NoSQL implementations (unless it's a really specific one-off use case for reading).

Just look at the nightmare MongoDB has created in most startups a year later.

mysterydip9y ago

The SQLite site says it better than I could: https://sqlite.org/whentouse.html

Personally I use NoSQL options for the "replacement of ad-hoc disk files" they mention on that page. Like many of the comments here, anything more advanced than that and I'm using a relational database.

scarface749y ago

I fell in love with Mongo when I first started using it 6 months ago - the whole NoSQL thing appealed to me. Then I fell in love with SQL Server 2016's JSON support - the best of both worlds.

plet9y ago

For me, NoSQL works great when the structure of the data is unclear but you have a fixed identifier that you can key off. SQL when the structure is known and juggling multi-table transactions is not a big PITA.

I'm not sure about the MongoDB nightmare; for me it's done everything the documentation claims it'll do.

chrisan9y ago· 7 in thread

Most(?) of the articles I read on HN, and then their comments, always seem to put NoSQL in a bad light when people use it for things that "should" be in SQL

What _are_ the "correct" use cases for NoSQL? Everything has always been relational data for me

metaphorm9y ago

> What _are_ the "correct" use cases for NoSQL?

there are many and the different use cases for it that I know of aren't really connected by a theme or any criterion that can be generalized. this is my own observations only, and not a statement based on theory. the wisdom of experience. take it for what it's worth.

1. application state persistence. persisting the app state in something closer to its native representation (JSON, for example) is convenient sometimes. if all you really want to do is save the state and then reload it later in another session you've got a strong use case for some kind of NoSQL. Note that application state persistence is very different than object state persistence. The app state is specific to a single user at a single point in time. Object state might have multiple users (anyone that can manipulate that object) and multiple points in time (e.g. objects that are persisted for long periods of time).

2. Key:Value store for caching. this is the most basic, canonical use case. Memcached, Redis, etc.

3. data structures that SQL is poorly suited for. Graphs and Trees come to mind. if you find your SQL schema suddenly becoming bogged down in M2M through tables, you probably have a good use case for a Graph oriented datastore.

4. Time series. arguably this is just a subset of relational data, or a domain specific language extension over traditional SQL, but still, it feels a little different in practice.

dragonwriter9y ago

> What _are_ the "correct" use cases for NoSQL?

Depends on which noSQL. GraphDBs make sense when your application needs efficient graph queries. Eventually-consistent, distributed stores make sense when your application needs very high availability and can tolerate eventual consistency.

NoSQL can also be justified on non-technical grounds as a temporary workaround for poor administrative policy regarding SQL databases, though most likely, once the DBA group realizes that you've built a critical non-SQL datastore, the policies will be extended to it as well if the basic problem hasn't been corrected.

BoorishBears9y ago

We designed an analytics framework that wouldn't gain much from SQL. Our analytics events are arbitrary JSON (with a few required keys like time). The SQL version of that would be to have a column for each required key, and putting all the non-required JSON in a column, and processing that column as JSON/JSONB. Instead we used Elasticsearch and got an efficient layout of our data for queries, and got visualizations for "free" (Kibana)

marcosdumay9y ago

You can use a NoSQL database when your records have no relation with each other (not even in different tables).

Still, it's not automatically a good choice just because the above applies. It's guaranteed only to not be a horrible choice.

The real answer is that nearly nobody has applications where NoSQL is a better choice than SQL databases. And those few that do will know it very well, so they don't need to ask around on internet forums.

jalayir9y ago

> What _are_ the "correct" use cases for NoSQL?

When you need a highly available and reliable DB for your application, then you need a cluster approach for data replication. Most popular SQL DBs are single-node, with some application-level clustering solutions, so the only option is to scale vertically. However, a lot of no-SQL DBs like Redis and DynamoDB do clustering/replication closer to the iron.

shakna9y ago

Something I made that was simpler as NoSQL: a comment system.

Specifically a page had a list of comments, and a comment was a date/time and some text.

There was no cross-referencing or nesting, just a basic ordered list.

There is a relationship there, but it's a simple 1:1 relationship.

ubernostrum9y ago

Depends what you mean by "NoSQL".

I've gotten plenty of mileage out of using key-value stores to complement relational databases. They're good at acting as task queues and caches, for example.

danso9y ago· 6 in thread

I've already posted on HN about how I use SQL in my public affairs data journalism class [0]. To me, there is no better gateway language to the power of computation and programming, in terms of accessibility and return on investment, than SQL, with the exception of spreadsheets and formulas. Even if you don't go further into programming, SQL provides the best way of describing what we always need to do with data for journalistic purposes -- joining records, filtering, sorting, and aggregating. Ironically, I learned SQL late in my programming career, and initially thought its declarative paradigm to be mysterious and inferior to procedural languages. In fact, I don't know how to do anything in SQL beyond declarative SELECT queries (and a handful of database create/update admin queries). Turns out this is just powerful enough for me for most app dev work (Rails, Django), and the simplicity is a boon for non-programmers.
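Those four operations fit in a single query. A minimal sketch using Python's built-in sqlite3 module; the agency and contract tables and their contents are made up purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agency (agency_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contract (agency_id INTEGER, year INTEGER, amount INTEGER);
    INSERT INTO agency VALUES (1, 'Parks'), (2, 'Transit');
    INSERT INTO contract VALUES
        (1, 2015, 100), (1, 2016, 250), (2, 2016, 400), (2, 2016, 50);
""")

rows = conn.execute("""
    SELECT a.name, COUNT(*) AS contracts, SUM(c.amount) AS total  -- aggregate
    FROM contract AS c
    JOIN agency AS a ON a.agency_id = c.agency_id                 -- join
    WHERE c.year = 2016                                           -- filter
    GROUP BY a.name
    ORDER BY total DESC                                           -- sort
""").fetchall()
print(rows)
```

Each clause maps onto one of the questions a reporter would ask of the data, which is a large part of why it works as a first language.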

ProPublica just published a bunch of data-related jobs and positions. The phrase "Proficiency in SQL is a must" makes an appearance: https://www.propublica.org/atpropublica/item/propublica-is-h...

[0] https://news.ycombinator.com/item?id=8505000 https://news.ycombinator.com/item?id=10585009

wodenokoto9y ago

I've recently started learning Pandas (a dataframe library for Python) and I find most queries cumbersome compared to the absolute minimal SQL I know.

lowmagnet9y ago

Pandas is mostly columnar in nature. If you're heavy into Python idioms it's fairly easy to grok, imo, but there is a subtle learning curve about what is in and out of kernel. Since in-kernel operations are about 50-100x faster, it's obvious when you did the wrong thing, but it doesn't show until data sets are huge.

I tried doing something regarding browser hits from Akamai data in Pandas and 3 different SQL databases (MySQL, Postgres, SQLite) and nothing came close to Pandas for holding 150m hits (one day's worth across our properties) in memory. Especially with Dask Dataframes mixed in. No competition for the effort involved.

pweissbrod9y ago

Using tools such as sqlmagic or pyodbc for IPython (below) you can let the SQL database perform most of the heavy lifting of getting "raw" result sets into a dataframe and then perform some lighter, possibly more Python-idiomatic tweaks to the data once the focused set is loaded in-memory.

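  import pandas
  import pyodbc

  def getdataframe(sql):
      # Connect before the try block, so `finally` never closes a
      # connection that was not opened.
      connection = pyodbc.connect("DSN=myDSN", autocommit=True)
      try:
          # Let the database do the heavy lifting; load only the result set.
          return pandas.read_sql(sql, connection)
      except Exception as e:
          # Report the error; the function then returns None.
          print(repr(e))
      finally:
          connection.close()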

arkh9y ago

> with the exception of spreadsheets and formulas

You may want to check Oracle's MODEL clause. Not as powerful as a spreadsheet but you can already do a lot of things in the middle of your SQL.

danso9y ago

Thanks for the tip, hadn't heard about it before but any SQL feature that inspires articles like "Using the SQL MODEL Clause to Generate Financial Statements" [0] is generally the mix of practicality and code that I'm looking for. I suppose one major roadblock is the use of Oracle SQL. Not the language but the database software. I used to teach MySQL and SQLite before settling on solely the latter. First, SQLite can easily handle any database that students have to deal with, which at max are in the gigabyte range (and usually much much smaller than that). Second, there are far fewer moving parts. I mean, they still have to deal with the idea that SQLite is a database, and that something like DB Browser for SQLite [1] is a client, but it's much less of an obtrusive detail than installing MySQL, a MySQL GUI, and then checking to see if the MySQL daemon is running on their personal laptops.

[0] http://www.orafaq.com/node/69

[1] http://sqlitebrowser.org/

kwillets9y ago

I support a lot of data scientists using SQL and map-reduce technologies. The ones using SQL are about 10x more productive. The ones on map-reduce are building tools to figure out why their metrics are bad, because it takes too long to re-run them.

cm21879y ago· 5 in thread

I think most developers using SQL use it a bit like most office users using VBA (i.e. they know how to record VBA macros but not much more). Most developers know how to write basic queries, so they have a very superficial understanding, but most likely know very little about the performance implications, structured indexes, nesting of queries, etc. Whereas I would expect developers who claim to use C to have more than a superficial understanding of the language.

maxxxxx9y ago

That describes me. I think it would be good if SQL databases had more accessible tools. The experience is just totally different from other programming languages. When I see a complex query I can't read it and don't understand what its implications could be. It reminds me a little of regular expressions. If you don't use them all the time you forget the syntax and have to look it up every time you need them.

Either a more modern looking language or some easy tools to analyze and build queries would help a lot in my view. Also refactoring tools that can analyze the impact of table changes in all stored procedures would be nice.

72deluxe9y ago

But the same is true of any other language. Given a complex piece of C++, no amount of IDE features is going to help anyone understand it, nor know what its implications could be in a system (without looking at the entire source code to see where it is used).

What you are lacking is:

a) knowledge of your chosen database system in sufficient detail to know the trade-offs and pros/cons, and

b) sufficient knowledge of SQL specific to your chosen database system in order to get the most out of it.

This is a bit like knowing C++ in sufficient detail to be able to cope with a complex bit of code that you see, and also knowing your compiler sufficiently well so that you know it has foibles/bugs in certain areas.

What you need is more knowledge, not more tools. Even if I have a hammer and a chainsaw, both are useless to me if I don't know enough about wood to build what I want.

antisthenes9y ago

> The experience is just totally different from other programming languages. When I see a complex query I can't read it and don't understand what its implications could be. It reminds me a little of regular expressions. If you don't use them all the time you forget the syntax and have to look it up every time you need them.

The experience you describe is exactly like any other programming language. If I look at a Haskell program, it will look like gibberish to me, because I don't know the syntax.

majewsky9y ago

> It reminds me a little of regular expressions. If you don't use them all the time you forget the syntax and have to look it up every time you need them.

Another common ground between regexes and SQL is that you should always have the manual of the tool in question open when you write a regex / SQL query because their syntaxes are all subtly different. For example for regexes:

  Perl:   /^(?:aa|bb\b)+/
  vim:    /^\%(aa\|bb\>\)\+/
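As a third data point, Python's standard `re` module follows the Perl-style syntax for this particular pattern. A small sketch (the `matched_prefix` helper is just for illustration):

```python
import re

# Same pattern as the Perl line above: one or more repetitions of
# "aa" or "bb", where "bb" must end at a word boundary.
pattern = re.compile(r"^(?:aa|bb\b)+")

def matched_prefix(text):
    """Return the part of `text` the pattern matches, or None if it does not match."""
    m = pattern.match(text)
    return m.group(0) if m else None
```

Even here the manual matters: the raw-string prefix is needed so that Python itself does not interpret `\b` as a backspace character.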

tofflos9y ago

There are definitely some complex statements out there that require considerable effort to understand. But those statements are vastly outnumbered by relatively simple ones that are unreadable because the author did not follow best practices in writing readable and maintainable SQL. :(

Having said that I agree with you that there is plenty of room for improvement in the SQL-language itself.

cousin_it9y ago· 4 in thread

One of my dreams is building a hybrid of RDBMS and Protocol Buffers. It would look like a bunch of nested structs that can be kept in memory, fetched from disk or over the network. The schema would be kept in .proto files, and you'd be able to reload it at runtime (existing data would be handled according to protobuf schema evolution, because each field in each struct is numbered and old numbers aren't reused). Most joins would be gone because nested structs are allowed, but you'd still have an SQL-like language for more general queries (designing such a language when nesting is allowed is a bit tricky, but not very). Things like indices and transactions would also exist at every level (memory, disk, network) because they are useful at every level.

The end goal is eliminating impedance mismatch, because your current in-memory state, your RPCs and your big database would be described by the same schema language, somewhat strongly typed but still allowing for evolution. I have no idea if something like that already exists, though.

weavie9y ago

The problem is that what joins to what differs depending on the context for what you are looking for.

Sometimes you join an order to supplier, other times it will be to customer and other times it will be to a list of products. Sometimes, all of them need to be included.

With nested structs the parent/child relationship is fixed. If your query needs to invert this relationship, you essentially have to search through your entire database which will be incredibly slow and resource intensive.

Either that, or you just store your data and their relationships separately and then allow for highly optimised searches to be conducted between these relationships according to the query and statistics you hold about the data ... but then you have just reimplemented an RDBMS..

cousin_it9y ago

I guess the idea is to make the schema as readable as possible using nesting, and then use indices for the rest.

TimJYoung9y ago

Yep, it was one of the things used in quite a few systems before SQL became more widespread:

https://en.wikipedia.org/wiki/MultiValue

jimktrains29y ago

> One of my dreams is building a hybrid of RDBMS and Protocol Buffers.

That's actually one of my goals with https://github.com/jimktrains/drsdb. All communication would be done via protocol buffers and schemas could be created or loaded at runtime to be verified and used.

The primary goal being a distributed RDBMS written in Rust.

zephyrfalcon9y ago· 4 in thread

"Why do we still use SQL" and "Why do we still use relational databases" are two very different questions. They seem like much the same thing, because SQL is pretty much the only query language offered by relational database systems nowadays... so if you use SQL, you use an RDBMS, and vice versa. But other query languages used to exist. There was QUEL [1], for example. It seems to have fallen by the wayside; most people have probably never heard of it. I guess there is very little room for multiple languages in this particular space.

[1] https://en.wikipedia.org/wiki/QUEL_query_languages

default-kramer9y ago

Absolutely. Relational databases are so useful that I happily use SQL even though it is not a very good language. I have thought about making a "compile-to-SQL" language in the same vein as Typescript. (Does this already exist? You would think so, but I can't find one.)

HTSQL has some great examples of where SQL could be better: http://htsql.org/doc/overview.html#why-not-sql. I would love to get that goodness in a language not tied to the rest of HTSQL.

Zak9y ago

There have been a few languages that compile to SQL. CLSQL for Common Lisp and Korma for Clojure come to mind.

jmcqk69y ago

There are quite a few things that "compile-to-SQL". In .NET, the Entity Framework ORM takes the AST from the C# code in the query and generates the SQL from it. I'm not sure how ORMs work in other languages, but I would imagine it's something similar.

marcosdumay9y ago

SQL is a great language. So great that it's a superset of everything good ever created on this domain¹, but still cognitively simple.

There is space for statistical and search-based (like Prolog) languages, but those are very niche. Proof of that is that they exist, but yet nobody here is talking about them.

1 - Ok, object store query languages are not a strict subset of standard SQL. But it's only a matter of adding one or two commands, like Postgres does.

baldfat9y ago· 3 in thread

SQL is the second most underused tool we have for dealing with data, AWK being the first. SQL is great because its logic suits working with data and forces you to make good decisions earlier.

Everyone should spend a day learning SQL, if for no other reason than the ability to think logically about data.

majewsky9y ago

awk is really nice. I'm starting to use it more and more in places where I previously used a long pipe of grep, cut and sed.

GuB-429y ago

I use perl when grep/cut/sed show their limits. I never really got into awk.

I suppose we all have our favorite tools.

tkyjonathan9y ago

Amen.

vinceguidry9y ago· 3 in thread

I came expecting a treatment on Structured Query Language, was disappointed when it turned out to be on Relational Database Management Systems. It doesn't take a rocket scientist to figure out why RDBMSes won out over non-relational systems.

What I want to know is why nobody ever came up with a better query interface. Every abstraction I've ever seen was built on top of SQL.

jrochkind19y ago

Because SQL is so good at representing operations in the relational algebra that rdbms are based on, and such a mature technology, that it's awfully hard to replace it with something better? SQL and rdbms kind of go together, and always have -- it's the query language that parsimoniously represents what you can do to a relational source, whose development has happened in concert with rdbms themselves.

What don't you like about SQL?

wolfgang429y ago

Not OP, but I find the syntax to be arcane and bizarrely restrictive. Every query has to be SELECT FROM JOIN WHERE GROUP BY HAVING ORDER BY; for some reason the table comes after its fields and the ORDER BY can't be expressed using field aliases defined at the start of the query.

Instead, I've been toying with the idea of a language where queries are expressed as a pipeline of operations on result sets, like this:

    FROM employee
      /* result set: all employees */
    LEFT JOIN department ON employee.DepartmentID = department.DepartmentID
      /* result set: all employees + their department */
    ORDER BY employee.last
      /* result set: all employees + their department, sorted */
    WHERE department.name IN ('Sales', 'Engineering')
      /* result set: Sales and Engineering employees + their department, sorted */
    FIELDS employee.first, employee.last, department.name
      /* result set: Sales and Engineering employees' first, last, and department names, sorted */

This is a SELECT, but you can for example turn it into an update just by adding a clause to the end:

    UPDATE employee.salary = employee.salary * 1.05

This is a much more regular language; there's nothing enforcing you to write these operations in a certain order, and you can add or remove clauses as necessary. For complex queries, I find that I write in this style anyway, using a chain of WITH clauses as a pipeline with a final SELECT on the end to get the results.
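The WITH-chain style mentioned above can be sketched in today's SQL. This is an illustration run through Python's built-in sqlite3 module; the table layout follows the example, but the column names `first_name`/`last_name` and the sample rows are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (DepartmentID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (first_name TEXT, last_name TEXT, DepartmentID INTEGER);
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Engineering'), (3, 'Legal');
    INSERT INTO employee VALUES
        ('Ann', 'Young', 1), ('Bob', 'Abel', 2),
        ('Cy', 'Marsh', 3), ('Di', 'Nobody', NULL);
""")

PIPELINE = """
    WITH joined AS (       -- step 1: all employees + their department
        SELECT e.first_name, e.last_name, d.name AS department
        FROM employee AS e
        LEFT JOIN department AS d ON e.DepartmentID = d.DepartmentID
    ),
    filtered AS (          -- step 2: keep Sales and Engineering only
        SELECT * FROM joined
        WHERE department IN ('Sales', 'Engineering')
    )
    SELECT first_name, last_name, department   -- final step: fields + sort
    FROM filtered
    ORDER BY last_name
"""
rows = conn.execute(PIPELINE).fetchall()
print(rows)
```

Each CTE reads top to bottom like one stage of the pipeline, though every stage still has to be written as a full SELECT.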

spacemanmatt9y ago

Thing is, SQL maps fairly directly to the fundamental operations defined on Relations in their rigorous sense. Every abstraction of those operations is going to have a similar form to SQL unless it seriously over-complicates things.

skc9y ago· 2 in thread

I've always thought the main reason NoSQL solutions became popular is that developers could finally get at and manipulate the data the way they wanted to.

I've known some very, very prickly DBAs in my time who referred to the databases they looked after as "My database". So they would say things like "Don't put junk in my database".

And would give you endless grief over how you wrote your queries or asked you a million questions about why you needed a new table and why your proposed design was shit.

As a result, many of us devs tended to view SQL Databases as some sort of dark art. So in this regard, NoSQL is freeing at first glance.

But if I'm honest, once I got over my fear, the "pros" of NoSQL solutions in comparison to good ol' SQL seem to be relatively feeble.

I think it's easier to get up and running with a NoSQL solution because there is far less friction when it comes to rapid prototyping of ideas, but things get complex pretty quickly.

I'd also say that for the vast majority of applications out there, the difference between the two will mostly be a wash.

hackits9y ago

By any chance, were these DBAs also the Linux server administrators?

My own personal experience is that DBAs/Linux server administrators have authority issues. I had one situation where I had an encrypted file on one of the Linux servers and the server administrator requested the decryption key for months so he could inspect the file.

duozerk9y ago

That seems highly unprofessional. I won't even ls into users' directories without their express permission.

menzoic9y ago· 2 in thread

Is it fair to compare the age of SQL to the age of JavaScript, or to say that SQL has survived rapid change? SQL is a class of languages, while JavaScript is a specific language. The concepts behind JavaScript are older than SQL, and the modern SQL languages we use today are younger than 43yrs.

majewsky9y ago

> the modern SQL languages we use today are younger than 43yrs.

99% of people use a minuscule subset of SQL, most of which easily dates back decades.

Twisell9y ago

But so does JavaScript; it's not like it hasn't changed...

ianamartin9y ago· 2 in thread

SQL was my entry point into software development, and I have a somewhat emotional attachment to it. And I'm quite glad that it worked that way.

SQL, relational theory, and set theory are a great place to start understanding how to work with data. And a great way to start understanding software.

All software deals with data. If you don't have a good understanding of data, you are never going to have a good understanding of software.

One of the best books I've ever read was Applied Mathematics for Database Professionals by Lex de Haan, and Toon Koppelaars. I think that's the database equivalent of SICP. You need to read it and understand it if you want to seriously deal with data. And you want to if you want to write software.

I'm obviously biased because of the way I got into things, but I look at things as a top-down vs a bottom-up point of view. I was a violinist and music theorist before I got into technology, and the bottom-up approach has always resonated with me.

In classical music, pretty much everything bubbles up from a baseline foundation and a structure; the stuff at the top that you actually see is a result of that structure. You don't start with some notes that you want to play on an instrument and then go and try to find a structure that supports those notes. You go bottom-up. You lay the foundation and build on that.

It was easy for me to map that idea of musical theory onto a database early in my career. And I moved up in the stack as I needed to. I started by building things entirely in SQL. You want complex statistical analysis? Sure, I'll do that . . . in SQL. Because I didn't know any better.

Then I found out that there are actually other languages that can do certain types of things much better. R, Python, C#, etc.

11 years later, I'm now very capable in a number of languages, and I don't suck. Along the way, I've had to put a lot of effort into learning the things I would have got from a comp sci degree program, and I'm probably not the best at certain types of software challenges.

I use noSQL stuff for caches and data warehouses, I use some of that for offloading traffic and keeping the reads separate from the writes. But there isn't a project that I touch that doesn't involve SQL in some way.

SQL is incredibly useful every day. Learning it, knowing it, understanding it, is a bare minimum for people I hire.

If you have a comp sci degree, and you don't know SQL, I'm going to probably write you off. If you have a liberal arts degree of some kind, and you do know SQL, I'm probably going to hire. You can learn everything else on the job.

None of that is an excuse for the totally shitty article linked here.

We use SQL today because it's good and it works. Not many languages can say that these days.

yawaramin9y ago

Come to think of it, SQL has basically been my gateway into tech too. At one non-tech job, I used MS Excel's built-in SQL engine, MS Query, to basically implement an upsert. At another, I used MS Access to build a pricing calculator application and pricing SQL scripts which reduced hours-long processes into instants.

Now as a Scala developer, my team has actually decided to use PostgreSQL, and I'm still in the process of trying to convince them to fully embrace SQL instead of just using it as a dumb cache to ease memory pressure on our backend.

wwweston9y ago

Thanks for the book recommendation.

(And if you're hiring, I have a math degree and know some SQL. :)

crimsonalucard9y ago· 2 in thread

When the best SQL guy is a dude that memorized a bunch of language hacks to get the underlying algorithm to be more efficient I question the design principles of the language. Instead wouldn't it be better for the language to explicitly allow the user to apply algorithms or procedures to make things more efficient rather than apply hacks?

The language is too heavy of an abstraction away from what's really going on under the hood. In a way it suffers from the same issue as functional programming. Not saying functional programming/sql is bad but... it has issues like almost everything.

remotehack9y ago

SQL is about relational theory; all that matters is the data.

> Instead wouldn't it be better for the language to explicitly allow the user to apply algorithms or procedures to make things more efficient

That...is hacking.

crimsonalucard9y ago

No it is not hacking.

By explicit I mean BinarySearch(Table, x = name) rather than "SELECT * FROM Table WHERE x = name"

Let me explain to you why "explicit" is better... Why should "SELECT column_name1, column_name2 FROM table" be more efficient than "SELECT * FROM table"? The abstraction is so leaky that in order to make a query better you resort to a language hack that only makes sense when you understand the instructions SQL compiles down into. This is bad. Leaky abstractions are bad. I shouldn't have to know what the SQL query is doing to optimize....

In web development your application servers use languages like go or python that are closer to the metal which allow us to explicitly deploy certain algorithms without this strange layer of SQL expressions that compiles to imperative code. This leads to faster applications that are easier to optimize at the expense of using terse highly abstract expressions such as those found in SQL.

Here's the strange part of web development. Everyone knows that the bottleneck for most websites are in the database. Yet why do we deploy easily optimizable imperative languages in the application server while putting a highly inefficient SQL expression language over the main bottleneck (the database)?

Shouldn't it be the other way around? Shouldn't we have Web application servers written in highly abstract functional languages while Database languages written in easily optimizable imperative code that is closer to the metal?
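For what it's worth, the "BinarySearch(Table, x = name)" idea and the "SELECT * is slower" observation can both be seen in a query plan. Here is a minimal sketch using Python's built-in sqlite3 module; the `people` table, the `idx_people_name` index and the `plan` helper are made up for illustration, and other databases word their plans differently. An index on `name` gives the engine a B-tree search for the WHERE clause, and when the query asks only for indexed columns the engine can answer from the index alone, while `SELECT *` has to go back to the table for the remaining columns.

```python
import sqlite3

# Hypothetical table and index, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
conn.execute("CREATE INDEX idx_people_name ON people (name)")
conn.executemany(
    "INSERT INTO people (name, bio) VALUES (?, ?)",
    [(f"user{i}", "x" * 100) for i in range(1000)],
)


def plan(sql, params=()):
    """Return SQLite's own description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Needs every column, so each index hit is followed by a table lookup.
star_plan = plan("SELECT * FROM people WHERE name = ?", ("user42",))

# Needs only the indexed column, so the index alone can answer it.
narrow_plan = plan("SELECT name FROM people WHERE name = ?", ("user42",))

print(star_plan)
print(narrow_plan)
```

Both plans search the index rather than scanning the table; the narrow one is reported as using a covering index. So the "algorithm" is chosen by declaring an index once, not by rewriting each query.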

1 more reply

throwit2mewillU9y ago· 2 in thread

It works.

pc869y ago

"It works" is a pretty poor reason to continue using something. Horses and buggies work just fine. Automobiles are better.

I think it's important to look at why SQL is still the best tool for the job 43 years later, especially in the current climate of going to production with 6-week old JS frameworks.

throwit2mewillU9y ago

It works my young Padawan.

ryanar9y ago· 1 in thread

A lot of mentions to ORMs being the problem, them being poor abstractions, etc.

In my work with Django's ORM I have run into problem queries as often as I have with doing SQL, and Django's ORM has never let me down.

ORM keeps your thinking in line with your object oriented code, and I find it very easy and natural to use. Who cares if there are a bunch of artificial keys underneath, imo caching queries is a better solution to slow queries than trying to optimize SQL. So using the ORM is never a pain point.

The other advantage is automatic escaping to prevent SQL injection, which is still a top contender on OWASP's list. I never have to worry about SQL injection when using the ORM.

Maybe other ORMs are poor solutions, but at least with Django I have been very happy using it.
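To make the injection point concrete without any ORM: the protection comes from passing values as bound parameters instead of pasting them into the SQL string. A minimal sketch with Python's built-in sqlite3 module follows; the `users` table and the input string are invented for the example, and this is not a claim about how Django implements it internally.

```python
import sqlite3

# Hypothetical table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the quote characters
# in it change the meaning of the WHERE clause.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a value and is only ever compared as a string.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # every row matches
print(len(safe_rows))    # no user has that literal name
```

An ORM or a thin mapper does the second form for you, which is most of what "automatic escaping" buys.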

scriptkiddy9y ago

+1 for Django's ORM.

I find it to be extremely simple to work with. It also does a fantastic job generating efficient queries due to the way it forces you to structure your models. The `F`, `Q`, and aggregation functions are top-notch as well. If none of the ORM features suit you for a particular query, Django ORM allows you to write the raw query yourself in a secure manner. Couple this with the Migration system, and I doubt you'll find a better ORM suite anywhere else.

I find Python has some of the best ORMs out there between Django ORM, SQLAlchemy, Peewee, and Pony.

18nleung9y ago· 1 in thread

Does anyone know how to generate a visualization like the one under reason #2 ("Battle Tested")?

Here's the gif: http://imgur.com/K5a7U9O.gif

krallja9y ago

http://logstalgia.io/

csours9y ago· 1 in thread

Since a lot of people are thinking about this: Is there a good way to compose SQL?

I see a bunch of repeated items in many SQL queries, things that would be functions in another language.

One of my colleagues pointed out that this is indicative of poorly formed queries. What do you all think?

combatentropy9y ago

It is very likely that repetitive SQL could be helped by views (database views, not views in the MVC sense). Database views are just named queries:

  create view red_shirts as
  select *
  from shirts
  where color = 'red'
  ;

Then you can just say:

  select * from red_shirts;

This is a simple example. Views are normally much more intricate and useful. Basically any select-statement could be saved as a view.

Databases also let you define functions.
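As a rough sketch of how views compose, here is the `red_shirts` example run through Python's built-in sqlite3 module. The `size` and `price` columns and the `red_shirt_stats` view are assumptions added for illustration. A view can be queried with further filters, and another view can be built on top of it, so the shared WHERE clause lives in one place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shirts (id INTEGER PRIMARY KEY, color TEXT, size TEXT, price REAL);
    INSERT INTO shirts (color, size, price) VALUES
        ('red', 'M', 10.0), ('red', 'L', 12.0), ('blue', 'M', 11.0);

    -- The named query from the comment above.
    CREATE VIEW red_shirts AS
        SELECT * FROM shirts WHERE color = 'red';

    -- A second view built on the first one (hypothetical).
    CREATE VIEW red_shirt_stats AS
        SELECT count(*) AS n, avg(price) AS avg_price FROM red_shirts;
    """
)

# Filter the view like any table.
large_red = conn.execute("SELECT id FROM red_shirts WHERE size = 'L'").fetchall()

# Query the view-on-a-view.
n, avg_price = conn.execute("SELECT n, avg_price FROM red_shirt_stats").fetchone()

print(large_red, n, avg_price)
```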

agentultra9y ago

I've been programming for most of my life. SQL has been a big part of my career. And I love it. It's one of my top 5 languages.

It's a nice, functional, declarative language in the vein of prolog and such. You just tell it the shape of the data you want, where to materialize it from, filter, aggregate, calculate the window of, etc... and the system figures out how to execute it as efficiently as it can. It beats out procedurally munging data by a long shot. It's more concise for many operations than ML-like variants.

It's a great tool to have. And understanding the underlying maths, relational algebra, is beautiful. I've found trying to implement your own rick-shod relational database is a good way to try to mechanically understand the theory. Then move on to implementing datalog... etc. The reason why SQL continues to stick around is that the fundamental theories are quite sharp! I'd appreciate a more concise syntax some days but overall I can't say I'm displeased. It's great!

dghf9y ago

> SQL and relational database management systems or RDBMS were invented simultaneously by Edgar F. Codd in the early 1970s.

Codd didn't invent SQL. Donald Chamberlin and Raymond Boyce did.

> SQL is originally based on relational algebra and tuple relational calculus

Maybe "originally based on", but not "an implementation of". For example, it is perfectly possible for an SQL query to return duplicate rows, which isn't possible under relational algebra/calculus, a relation being by definition a set of rows (or, more precisely, of tuples).
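The duplicate-rows point is easy to check. A small sketch with Python's built-in sqlite3 module (the `shirts` table is invented for the example): projecting a single column returns a bag of rows, and you have to ask for DISTINCT to get the set that relational algebra would give you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shirts (id INTEGER PRIMARY KEY, color TEXT)")
conn.executemany(
    "INSERT INTO shirts (color) VALUES (?)", [("red",), ("red",), ("blue",)]
)

# Plain projection keeps duplicates: SQL results are multisets (bags).
bag = conn.execute("SELECT color FROM shirts ORDER BY color").fetchall()

# DISTINCT collapses them into what a relation would contain.
as_set = conn.execute("SELECT DISTINCT color FROM shirts ORDER BY color").fetchall()

print(bag)     # 'red' appears twice
print(as_set)  # 'red' appears once
```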

jaked899y ago

"It’s like how MailChimp has become synonymous with sending email newsletters. If you want to work with data you use RDBMS and SQL. In fact, there usually needs to be a good reason not to use them. Just like there needs to be a good reason not to use MailChimp for sending emails, or Stripe for taking card payments."

Wow, that's a subtle, almost unnoticeable promotion of MailChimp. /s

swalsh9y ago

I'm not as old as some of you guys, been programming professionally for 11 years. The one constant in my life is SQL. I like it. My C# code is all obsolete now, the javascript code I wrote a year ago is obsolete, the Ruby code I've written never became a successful business. About half the PHP code I've written has been rewritten by now probably. But the SQL I've written is still living on, the tables I designed are still in use, the queries are still querying.

_pmf_9y ago

I'd really like for something like K/Kx to pick up. For server applications, the dichotomy between DB and application seems so artificial for a lot of applications. Think Erlang + Mnesia, but with a fully relational model backed by language primitives.

I think LINQ with F#'s type providers would probably be what I have in mind (which works like JOOQ with a tighter integration).

Animats9y ago

The great thing about SQL databases is that the expected standard of performance is "just works". Works all the time, for years, without trouble, even for the hard cases. All the major SQL databases, SQLite, MySQL, MariaDB, Postgres, Microsoft SQL Server, and Oracle, achieve that.

Contrast this with most webcrap. Or most of the NoSQL databases.

cygned9y ago

In my experience, (O)RDBMS and SQL are very good solutions for most business cases. I know a lot of projects that jumped on the "NoSQL for everything" train and eventually migrated partly to an RDBMS.

I often don't understand why people try to avoid SQL at all costs instead of just learning and applying it properly. I don't understand those "we do SQL for everything" teams either.

RDBMS in conjunction with NoSQL solutions can be a very powerful combination. We do a lot of Postgres + Redis + CouchDB in my projects.

dangoldin9y ago

A bit of a plug but I wrote about this a week or so ago describing SQL as the perfect interface. Databases change and evolve but since they all wrap the underlying engine in SQL it becomes very easy to use new technologies under the same interface: http://dangoldin.com/2017/04/11/sql-is-the-perfect-interface...

edpichler9y ago

This reminds me of what my teacher said to me a decade ago: "Relational databases are the most successful software humanity has created."

siscia9y ago

I love SQL and I believe it will have a big jump in popularity with microservices.

Each service should have a separate data source and be solely responsible for a specific part of the data.

In this environment, however, a full RDBMS is a bit of overkill.

The solution I am working on is RediSQL: https://github.com/RedBeardLab/rediSQL

It is a Redis module that embeds SQLite.

Redis is nowadays a common piece in any infrastructure.

The little module plugs into Redis and exposes a new command, REDISQL.EXEC, that provides the ability to run SQL statements.

It is multithreaded, does not impact the performance of Redis, and is very simple to use.

Great write performance: I got 50k inserts per second on my machine, which should be enough for most microservices.

I would love any kind of feedback on the module, and if you need any help getting started, just open an issue.

https://github.com/RedBeardLab/rediSQL

rconti9y ago

My favorite SQL fact: The San Carlos Airport (which is about 1mi from Oracle HQ; a plane losing power on takeoff would very plausibly crash into the towers, and in fact one did fall into the Redwood Shores lagoon a few years back, likely a choice by the pilot to avoid hitting a populated area) is KSQL, so airport code SQL. And its existence predates Oracle's headquarters being located there. It's just a coincidence.

KSQL and Oracle towers: http://www.bayareapilot.com/IMG_0317%20Large%20Web%20viewnea...

EternalData9y ago

Good old SQL. Your reliable friend that always shows up with the right amount of booze and gas money, and which, when you stop to think about, basically hasn't ever majorly fucked up around you.

davidw9y ago

I think back to my first programming job and some of the stuff I used then: Perl, early versions of PHP and other such tools that I haven't used in a long time.

Two of the things I still use to this day, though, are:

* Postgres

* Emacs

tannhaeuser9y ago

While SQL has a large class of uses, and also a smaller class of problems for which better solutions exist (CAP constraints, time series and other massive self-join apps, session storage ...) the reason we're going to use SQL in another 40 years still is that there's no cross-vendor standardization effort going on anymore (NoSQL vendors don't seem to find it necessary to drive sales and market growth, and customers don't demand it either).

LeanderK9y ago

Is there any work on a successor to SQL (just the language, maybe as an optional frontend)? I am not a fan of SQL. It works great for simple queries, but fails (in my opinion) for more complicated ones. They get way too complicated and hard to understand for something that would be easy in other languages. This is not a critique of relational databases, only the language.

manigandham9y ago

The amount of confusion in the comments highlights why we don't have anything better.

SQL is just a language; it specifically stands for Structured Query Language. It has nothing to do with the underlying database.

Relational databases all implement SQL because that's what the language was originally created for but it's just an interface. Relational databases can also implement other interfaces like mysql with its X protocol. Other database types like key/value, document, graph, columnar, time-series, RDFs, etc can also implement SQL and many are starting to for easier interoperability, like Cassandra with CQL.

There is definitely potential for a better query language and there are examples like ReQL and GraphQL but SQL is still just fine for most use-cases.

rumcajz9y ago

Unlike with imperative languages, which are a dime a dozen, there's almost no alternative to SQL when it comes to relational languages.

That being the case, people rarely even think about whether SQL is a good language or a bad language, whether it's lacking something etc.

But once you actually try thinking about it, it turns out that it's a pretty well designed language and any alternatives you can think of are usually much inferior to it.

Skylled9y ago

I think it's funny when the author compares SQL being most loved with least dreaded. They're the same thing. The percentages all add up to 100.

jordanthoms9y ago

SQL's just so much more flexible than the competing query languages, there's not a true alternative to it. One syntax does a decent job for transactional processing, key/value tables, heavily relational data, massive analytical processing, data warehousing, etc. It's pretty ugly, but it's flexible enough that you can get the job done even if you need to do something complex.

arnon9y ago

When we approach a customer with our database, SQL compliance is super important to them.

Some of our competitors used to be 'SQL-like', and even they swapped to using full SQL.

I think the fact that SQL is based on solid mathematical principles really helps it stay relevant.

carapace9y ago

Should mention SQL wasn't the original RM language: https://en.wikipedia.org/wiki/Alpha_(programming_language)

;-)

threepipeproblm9y ago

Possibly relevant: Modern SQL is a good resource, by Markus Winand, on the newer aspects of the language. http://modern-sql.com/

gigatexal9y ago

You can get away with not using SQL if using the functionality from LINQ or Java streams but as a DBA I feel most at home with SQL.

eddd9y ago

SQL as a language is one thing. The fact that RDBMSs come with consistency and isolation guarantees is a different story.

misterbowfinger9y ago

Meh. I'm not one to sing the praises of SQL so highly. I understand its history and its use, but it's also really, really difficult to understand and figure out complex SQL queries.

Personally, I thought REQL was a really interesting take on query languages. As a developer, it allowed you to write much cleaner code. You barely need an ORM. REQL kinda sucks for analysts at first, but in the long run, it makes writing complicated queries much, much easier.

sebringj9y ago

It lists Redis in there for the SQL ones. Redis has SQL now?

wittgenstein9y ago

I'm wondering what is the revenue of SQLizer?

z3t49y ago

I guess a lot of performance was sacrificed and a lot of optimizations were made, which makes SQL very fast in the current era.


konradb9y ago· 28 in thread

charles-salvia9y ago

[1] https://en.wikipedia.org/wiki/Object-relational_impedance_mi...

MarHoff9y ago

But SQL isn't designed around objects with fields, it's designed around tables and rows.

I think a more correct analogy would be that table are like classes, columns are the properties, and rows are instances. And so defining foreign keys is like setting a pointer to a parent instance.

There is not direct analogy for methods, but you can use function/trigger to do the same job.

It's clearly not convenient for general programming, but as soon as data manipulation is involved you benefit from a lot of built-in optimization.

3 more replies

corpMaverick9y ago

2 more replies

enobrev9y ago

I've only used non-relational stores for portions of projects that explicitly warrant them, since.

VLM9y ago

arjunrc9y ago

Loved the vampire/garlic analogy!

jasode9y ago

>a phenomenon of people finding SQL 'too complex' and moving to noSQL?

Fyi... the "SQL vs noSQL" means at least 2 different ideas and some of your replies are highlighting one aspect but not the other.

konradb9y ago

VLM9y ago

I'd toss out a third explanation for NoSQL: it's not really a language thing but a normalized/denormalized concept thing.

acdha9y ago

The two factors I saw:

stillkicking9y ago

AndrewOMartin9y ago

As a guess, SQL and noSQL are different tools good for different things and before noSQL people were trying to hammer their document-store shaped peg into a relational-database shaped hole.

jaredklewis9y ago

As the web grew, so did the scale of websites.

Many sites and apps found themselves in a situation where they would gladly trade strong consistency, for more performance and eventual consistency.

Various NoSQL solutions were developed to fill these needs.

Sure, they were overused for a bit (though that fad has basically passed), but they definitely arose to fill a real need.

TimJYoung9y ago

JustSomeNobody9y ago

I don't believe this is the best way to decide on a technology.

jfroma9y ago

This doesn't make much sense to me. I think the answer is just lack of experience.

dboreham9y ago

It is hype driven. Or perhaps driven by a failure to understand the subject, coupled with a natural tendency to look for simple, and ultimately trite explanations for everything.

bushin9y ago

The pot calling the kettle black.

metaphorm9y ago

relational data modeling is something you will learn from experience once you're developing an application where the business domain data is inherently relational.

hackits9y ago

sixdimensional9y ago

erikb9y ago

flarg9y ago

Just from my pov, SQL does not support a document view of data and does not easily support versioning of same and this is a huge problem certainly in complex business systems.

dragonwriter9y ago

> SQL does not support a document view of data and does not easily support versioning of same

kwillets9y ago

Honestly, I think there's a cargo cult of people who want things to be impressively complicated.

tkyjonathan9y ago

I think you missed the part in the article where it said “SQL - it’s so easy marketers can learn it.”

bitwize9y ago

Because MongoDB is web scale. It doesn't use SQL or joins and that's the secret ingredient in the web scale sauce. That and sharding.

mdpopescu9y ago

I think the people downvoting this haven't seen [1] :)

[1] https://www.youtube.com/watch?v=b2F-DItXtZs

1 more reply

cyberferret9y ago· 26 in thread

Even though I use ORMs in my project these days, every time I have to test a complex query, I write it in raw SQL first and check it before trying to make the ORM duplicate the same query.

Granted, NoSQL has its place and its advantages, but for me, when it comes to "money code", I will stick to SQL.

matwood9y ago

fnord1239y ago

> I think the lack of understanding of SQL and bad ORM experiences (Hibernate WTF) is what led people to think SQL/RDMBs were the problem when in fact they were not.

4 more replies

jasode9y ago

>Schemaless does not exist. There is always a schema, except in a schemaless data store, the schema has been moved to app code

[1] https://en.wikipedia.org/wiki/Ontology

4 more replies

techno_modus9y ago

> Schemaless does not exist. There is always a schema...

3 more replies

fortpoint9y ago

Fairly or not, he suggested the current fight between noSQL/Schemaless vs SQL/RDBMS was being fought in ignorance of all that went on in the 70s.

emilecantin9y ago

ZenoArrow9y ago

1 more reply

ams61109y ago

Agree so much.

The first thing I do for almost any new system is design the data model. Once that satisfies all the requirements, building an application and UI on top of that is usually pretty straightforward.

1 more reply

sixdimensional9y ago

Just like a table without a schema... because after all, it's still likely objects all the way down.

There is no such thing as working without a "schema", because software simply can't function without data and corresponding metadata to describe it.

lukaseder9y ago

> There is always a schema

It has a name. Schema-on-write (static schema) vs. schema-on-read (dynamic schema).

Sir_Cmpwn9y ago

I ended up using NoSQL for ephemeral data and SQL for persistent data in most of my applications.

gregmac9y ago

I'm a big fan of Dapper for this, so a lot of my data layer code looks like:

    public IEnumerable<Order> LoadOrders(int customerId) 
    {
        using (var db = GetConnection())
        {
            return db.Query<Order>("SELECT * FROM orders WHERE customerId = @customerId", new { customerId });
        }
    }

senorjazz9y ago

I fully agree. For simple things ORMs can be useful (very simple) anything else they get in the way.

2 more replies

swalsh9y ago

1 more reply

electrum9y ago

That looks very similar to JDBI for Java: http://jdbi.org/

ReidZB9y ago

At my workplace, we've started using jOOQ for this sort of thing. https://www.jooq.org/ (albeit we're a Java shop, not a C# shop)

It still sort of generates SQL for you, but really you end up writing the SQL yourself except in a type-safe way.

overcast9y ago

cryptonector9y ago

Typically there are two arguments for NoSQL: "no SQL!" (i.e., "the SQL language is so ugly, difficult, and painful") and "no ACID" (i.e., eventually consistent, hopefully, maybe).

The first one is always demonstrated to be a terrible argument when someone creates a SQL alike for whatever NoSQL we're talking about.

I'm extremely skeptical of NoSQLs, as you can tell. I would say that NoSQLs have NoPLACE. (Certainly as to that first argument.)

Zak9y ago

4 more replies

samirillian9y ago

To be clear, though, you're talking more about the Relational Model than SQL per se.

cryptonector9y ago

There aren't very many query languages for relational DBs that come close to SQL's power.

pscarey9y ago

I'm currently using Firebase at work (startup, mobile app, web dashboards), and I think it's got some great features.

As a bonus, additional features like Authentication, Analytics, Push Notifications, etc are all convenient to have bundled up.

cyberferret9y ago

I may still revisit it sometime, and see if it will suit another project.

[0] - https://hackernoon.com/tophn-a-fun-side-project-built-with-v...

pkulak9y ago

manigandham9y ago

What does this have to do with the article? SQL - the query language - is completely separate from any database that implements it.

ransom15389y ago· 19 in thread

pjmorris9y ago

vog9y ago

Note that "tables" in this statement had a different meaning than today. That one wasn't about relational databases, it was about tables as in-memory data structures.

In today's words, "flowcharts" means "code" and "tables" means "data structures".

I don't care if it is:

- tables in a relational database

- nested structs (records) and lists

- nested dictionaries (hash tables) and lists

- JSON

- XML

- ... whatever

1 more reply

scarface749y ago

Show me your well structured code and well named unit tests and I won't need your flow charts.

Show me your well defined normalized tables and I have all I need.

But for the love of $entity please don't show me tons of business logic in stored procedures. I actually won't work for a company that expects me to build or maintain a product based on stored procs.

2 more replies

mannykannot9y ago

forgotpwtomain9y ago

geebee9y ago

jmfurlott9y ago

jgeraert9y ago

Seeing this as well in proprietary software sold by one of the biggest erp software vendors... And it's still not fixed and likely won't be fixed anytime soon.

2 more replies

elsurudo9y ago

That's true, but you could always either create them yourself in SQL in a migration, or use a gem like foreigner (as I did). Always use FKs...

jrochkind19y ago

You put your business logic in the rdbms? As constraints and triggers and stored procedures and such?

hackits9y ago

For example, too many times I've updated a record with a new column value only to find out that some trigger caused the value I'd just updated to be set to null.

cryptonector9y ago

One advantage: you get all the write concurrency your RDBMS can give you for your transactions. Too many ORMs I've seen just have a Big Lock, leaving you with just one writer at a time.

I can't think of disadvantages to putting your business logic in the RDBMS. Can you elaborate?

1 more reply

pasta9y ago

I think he meant putting constraint and index logic in the code.

For example: I've seen people rely on a unique index in the code. But if you set a unique index in the DB you know it will be unique always. Even when the code is replaced some day.
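A minimal sketch of that difference, using Python's built-in sqlite3 module (the `accounts` table, the index name and the `register` helper are hypothetical): with the unique index declared in the database, the second insert is rejected no matter which piece of application code attempts it, so the code only has to handle the error, not guarantee uniqueness itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
# The rule lives in the database, not in a "check then insert" code path.
conn.execute("CREATE UNIQUE INDEX idx_accounts_email ON accounts (email)")


def register(email):
    """Try to insert; report False if the database rejects a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO accounts (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False


first = register("a@example.com")
second = register("a@example.com")
count = conn.execute("SELECT count(*) FROM accounts").fetchone()[0]

print(first, second, count)
```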

Sean17089y ago

I'm not sure I'd call constraints business logic, and I would definitely want to at least double-check them in the database.

cr0sh9y ago

brudgers9y ago

mkesper9y ago

Before databases you had only flat files or VSAM (and before that, ISAM) for your data. Good luck debugging your data in a VSAM cluster. https://en.wikipedia.org/wiki/VSAM

2 more replies

1ris9y ago

collyw9y ago

Personally I find SQL to be one of the easiest languages to read (when reading code from other developers).

davnicwil9y ago· 17 in thread

sergiosgc9y ago

At project onset, usually only redis or memcache are in use.

There's an obvious bottleneck on writes to the relational DB but I've so far managed to keep from hitting that ceiling.

scarface749y ago

1 more reply

Kiro9y ago

elmigranto9y ago

> no schema

All that work could've been done by your database (which probably has some kind of JSON implementation anyway), but somehow it's a "no-brainer" to implement data consistency at app-level. Okay.

2 more replies

72deluxe9y ago

Couldn't you have designed your schema to have a playerProperties table with a playerId column, a key column and a value column?

That'd be forever extensible and if you made the primary key a combination of playerId and key, you'd offload duplicate handling to the database.

Seems simple to me.
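Roughly what that schema could look like, sketched with Python's built-in sqlite3 module. The snake_case names (`player_properties`, `prop_key`, `prop_value`) are stand-ins chosen for the example, and `INSERT OR REPLACE` is SQLite's spelling of "overwrite on conflict"; other databases have their own upsert syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE player_properties (
        player_id  INTEGER NOT NULL,
        prop_key   TEXT    NOT NULL,
        prop_value TEXT,
        PRIMARY KEY (player_id, prop_key)  -- one value per player per key
    )
    """
)

conn.execute("INSERT INTO player_properties VALUES (1, 'hat', 'fedora')")

# A second plain insert for the same (player, key) is refused by the database.
try:
    conn.execute("INSERT INTO player_properties VALUES (1, 'hat', 'beanie')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Overwriting is an explicit choice (SQLite-specific syntax).
conn.execute("INSERT OR REPLACE INTO player_properties VALUES (1, 'hat', 'beanie')")

# New kinds of property need no schema change.
conn.execute("INSERT INTO player_properties VALUES (1, 'pet', 'owl')")

props = dict(
    conn.execute(
        "SELECT prop_key, prop_value FROM player_properties WHERE player_id = 1"
    )
)
print(duplicate_rejected, props)
```

The trade-off is that every value is stored as text and the database can no longer check types or required properties per key, which is part of what the replies here are arguing about.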

1 more reply

pinouchon9y ago

fs1119y ago

jerven9y ago

public UniProt.org went from SQL normalized -> SQL primary key/blob to proper K/V (BDBje) to custom K/V in the space of ten years.

However, data production is still a mix of batch and sql systems.

zzzcpan9y ago

33degrees9y ago

If you consider ElasticSearch a NoSQL solution, then yes, many stories. A properly normalized database can be very slow to search, and throwing ElasticSearch in front of it is a very common solution.

prashnts9y ago

We used to use Postgres for almost everything, including full-text search.

danso9y ago

I had assumed Facebook started off with MySQL and now is NoSQL but that was just ignorant of me:

http://docs.datastax.com/en/articles/cassandra/cassandrathen...

The acknowledgments section makes it more clear that the MySQL data was indeed migrated over to Cassandra:

[1] https://news.ycombinator.com/item?id=7891316

[2] https://www.facebook.com/notes/facebook-engineering/the-unde...

johnchristopher9y ago

I suppose they'd look first into MySQL's new JSON support. (Or other SQL databases that have some kind of JSON support)

Klathmon9y ago

PostgreSQL has fantastic JSON support and i've used it a few times.

kalleboo9y ago

Does using memcache for caching count as "augmenting with a NoSQL solution"? Because that's pretty popular...

FLUX-YOU9y ago

Logging

lr4444lr9y ago

Yeah, I use MongoDB whenever my tables get too big. It's web scale! </s>

jackfoxy9y ago· 14 in thread

There just is no substitute for SQL. Some thoughts on what has given it a bad name:

3) SQL does not seem to be a required topic for undergrads. There are really no unsolved problems (of note), so it's not interesting to academics.

scriptkiddy9y ago

[1] In fact, I would make the argument that if you're altering primary keys at all, you're doing something wrong.

jackfoxy9y ago

gnaritas9y ago

1) Natural keys suck in the real world, artificial keys have come to dominate for a real reason that you either understand or suffer for ignoring. Natural keys are bad practice.

3) Should be.

pjungwir9y ago

> There are really no unsolved problems (of note), so it's not interesting to academics.

Here is your starting bibliography:

Richard Snodgrass, Developing Time-Oriented Database Applications in SQL.

Hugh Darwen & C.J. Date, "An Overview and Analysis of Proposals Based on the TSQL2 Approach".

Krishna Kulkarni & Jan-Eike Michels, "Temporal features in SQL:2011".

Tom Johnston, Bitemporal Data

Magnus Hagander, "A TARDIS for Your ORM": https://www.youtube.com/watch?v=TRgni5q0YM8

I would love to read what you come up with!

nickpeterson9y ago

I wish there was more work on the ORM side supporting some of these concepts (Schema evolution and temporality).

jackfoxy9y ago

Yes, I would agree, temporal databases are an interesting topic.

cheez9y ago

Natural keys can be composite, which makes it a pain to manage cross-table relationships. I typically have a unique constraint on the actual composite key and an auto-increment id.

gnaritas9y ago

That is the correct solution, a surrogate key.
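
A sketch of that pattern with Python's sqlite3 module. The enrollment and grade tables and all their columns are hypothetical; only the shape (surrogate id plus a unique constraint on the natural composite key) comes from the comments above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE enrollment (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        student_no TEXT NOT NULL,
        course     TEXT NOT NULL,
        term       TEXT NOT NULL,
        UNIQUE (student_no, course, term)              -- natural composite key
    );
    CREATE TABLE grade (
        enrollment_id INTEGER NOT NULL REFERENCES enrollment(id),
        score         INTEGER NOT NULL
    );
    """
)

natural_key = ("S-1001", "Databases", "2017-spring")
cur = conn.execute(
    "INSERT INTO enrollment (student_no, course, term) VALUES (?, ?, ?)",
    natural_key,
)
enrollment_id = cur.lastrowid

# Other tables reference the single-column surrogate, not three columns.
conn.execute(
    "INSERT INTO grade (enrollment_id, score) VALUES (?, ?)", (enrollment_id, 91)
)

# The unique constraint still rejects a second row with the same natural key.
try:
    conn.execute(
        "INSERT INTO enrollment (student_no, course, term) VALUES (?, ?, ?)",
        natural_key,
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(enrollment_id, duplicate_rejected)
```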

sobani9y ago

Can you give 1 example of a good natural key?

Note I will disqualify anything that has a reasonable chance of changing, like the primary email address of an account, a person's name, a person's date of birth, or the 'public id' of a bank account.

yawaramin9y ago

A composite key made up of foreign keys in a junction table.
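
For illustration, a junction table of this kind might look like the sketch below (Python's sqlite3 module; the book, author and book_author tables and the link helper are made up for the example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript(
    """
    CREATE TABLE author (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE book   (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE book_author (
        book_id   INTEGER NOT NULL REFERENCES book(id),
        author_id INTEGER NOT NULL REFERENCES author(id),
        PRIMARY KEY (book_id, author_id)  -- natural key: the pair itself
    );
    INSERT INTO author VALUES (1, 'A'), (2, 'B');
    INSERT INTO book   VALUES (1, 'First'), (2, 'Second');
    """
)

def link(book_id, author_id):
    """Try to record that an author wrote a book; report whether it was accepted."""
    try:
        conn.execute("INSERT INTO book_author VALUES (?, ?)", (book_id, author_id))
        return True
    except sqlite3.IntegrityError:
        return False

# Accepted, accepted, rejected as a duplicate, rejected as an unknown author.
print(link(1, 1), link(1, 2), link(1, 1), link(1, 99))
```

Neither half of the key is likely to change, since both are ids owned by other tables, which is arguably why this case survives the "reasonable chance of changing" test.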

irishsultan9y ago

How often does the birthday of people change? (Then again, how likely is it to be unique?)

ASipos9y ago

I studied SQL as the lab portion of a database design course. The sequel to that course was wholly dedicated to PL/SQL.

neuromantik80869y ago

And the sequel to that was postgrad?

combatentropy9y ago

The sequel to the SQL? ;)

js89y ago· 9 in thread

SQL is good, but it shows its age. Today somebody should come up with something statically typed and more functional (meaning using lambda calculus as a starting point).

The biggest pain points of SQL (IMHO) are:

- lack of statically typing guarantees (for example, no guarantee that a certain table has certain column)

- bad capability to abstract over parts of the data model (for example, queries have to specify the table that they query)

Both of these can be resolved with use of good enough functional language. There are projects like that in the FP/Haskell community (e.g. Ermine), but it's fragmented.

laumars9y ago

You might need to explain yourself a little more because I'm confused by your examples:

> "- lack of statically typing guarantees (for example, no guarantee that a certain table has certain column)"

> "- bad capability to abstract over parts of the data model (for example, queries have to specify the table that they query)"

    select * from people p, borrowed_books bb where p.name = 'laumars'

SQL definitely isn't a pretty language though so I am very interested in your thoughts for how a more intuitive, functional-inspired, query language might read.

js89y ago

I think marcosdumay already explained it pretty well.

> I would have described SQL as statically typed because columns are given a type when created.

Calling it either statically or dynamically typed is a misnomer, because it's something in between: like a dynamic language that can create, say, an empty list and declare that it can only contain integers.

I think a better term (which would explain the distinction between schema creation and query compilation) would be static typechecking of queries against the schema.

> The issue arises with multiple tables

I think the language should be functional, but also total (every function will finish, no recursion allowed), to make it easy for compilers (query engines).

marcosdumay9y ago

> I would have described SQL as statically typed because columns are given a type when created.

> You don't need to specify the table name for each column if you're only querying one table.

I do think the GP was about doing stuff like that:

    SELECT * FROM function_that_returns_table();

elmigranto9y ago

> statically typing guarantees

int will be an int. You can't store string there. What else do you need? Something along the lines of SQL's `check` or Postgres's domains?

> queries have to specify the table that they query

How do you imagine not doing that, something like "pull some things and do stuff with it"? I think we are quite far from that kind of reality.

> certain table has certain column

I don't get it. It either has, or it doesn't. In one case you get a value, in other one — SQL error. If that's so vital, pull up some schema information and check before running an app.
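
A rough sketch of the "check schema information before running the app" idea, using SQLite's PRAGMA table_info from Python. EXPECTED_COLUMNS and missing_columns are hypothetical names; other databases expose the same information through information_schema or their own catalogs.

```python
import sqlite3

# Hypothetical expectations an application might declare about its schema.
EXPECTED_COLUMNS = {"people": {"id", "name", "email"}}

def missing_columns(conn, expected):
    """Return {table: set_of_missing_columns} for every table that falls short."""
    problems = {}
    for table, wanted in expected.items():
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        # The table name comes from our own constant, never from user input.
        present = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
        missing = wanted - present
        if missing:
            problems[table] = missing
    return problems

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
print(missing_columns(conn, EXPECTED_COLUMNS))
```

This catches a missing column at startup rather than at the first failing query, but it is still a runtime check, not the compile-time guarantee the grandparent comment was asking for.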

deathanatos9y ago

> What else do you need?

Me personally: sum types. (Some languages call these tagged unions. Rust calls it an enum, but note that enum here does not mean "an integer under the hood")

  CREATE TABLE (
    ...
    count_or_empty union { int | empty } NOT NULL,
    ...
    PRIMARY KEY (..., count_or_empty, ...)
  );

techno_modus9y ago

> SQL is good, but it shows its age

"The relational model is dead, SQL is dead, and I don’t feel so good myself". [PDF] - https://sigmodrecord.org/publications/sigmodRecord/1306/pdfs...

> Both of these can be resolved with use of good enough functional language

Yes, functional aspects are exactly what is missing in RM in general and SQL in particular. Yet, I do not think this can be easily added - the whole paradigm has to be changed.

marcosdumay9y ago

But I do completely agree with the pain points you pointed there. I'd go further on the first point, client software should have an easy way to check its guarantees at connection time too.

js89y ago

> There's something to gain by adding functions as first-class objects, but I don't think rethinking it from functional foundations would help.

So what other approach to designing a declarative language do you want to take?

My question is not rhetorical. Building declarative languages based on typed lambda calculus is very well understood.

metaphorm9y ago

can you elaborate? you haven't really explained these ideas in a way that helps me understand them but it sounds interesting.

plet9y ago· 9 in thread

It's still amazing how far you can go with a single table and a few tweaks to a Postgres instance.

collyw9y ago

Sounds like you need to learn how to design a database. Seriously, do you find more than 1 table complex?

plet9y ago

That's odd. I don't recall saying more than 1 table is complex. For me, it's just a good time to think more about what kind of database to use, esp for pet projects.

noja9y ago

As soon as you have more than one table you switch to NoSQL?

Did you mean to write it that way round?

plet9y ago

> As soon as you have more than one table you switch to NoSQL?

danso9y ago

alunchbox9y ago

just look at the nightmare MongoDB has created in most startups a year later.

mysterydip9y ago

The SQLite site says it better than I could: https://sqlite.org/whentouse.html

scarface749y ago

I fell in love with Mongo when I first started using it 6 months ago - the whole NoSql thing appealed to me. Then I fell in love with Sql Server 2016's JSON support - the best of both worlds.

plet9y ago

I'm not sure about the MongoDB nightmare; for me it's done everything the documentation claims it will.

chrisan9y ago· 7 in thread

Most(?) of the articles I read on HN, and then their comments, always seem to put NoSQL in a bad light when people use it for things that "should" be in SQL

What _are_ the "correct" use cases for NoSQL? Everything has always been relational data for me

metaphorm9y ago

> What _are_ the "correct" use cases for NoSQL?

2. Key:Value store for caching. This is the most basic, canonical use case. Memcached, Redis, etc.

4. Time series. arguably this is just a subset of relational data, or a domain specific language extension over traditional SQL, but still, it feels a little different in practice.

dragonwriter9y ago

> What _are_ the "correct" use cases for NoSQL?

BoorishBears9y ago

marcosdumay9y ago

You can use a NoSQL database when your registers have no relation with each other (not even in different tables).

Still, it's not automatically a good choice just because the above applies. It's only guaranteed not to be a horrible choice.

jalayir9y ago

> What _are_ the "correct" use cases for NoSQL?

shakna9y ago

Something I made that was simpler as NoSQL: a comment system.

Specifically a page had a list of comments, and a comment was a date/time and some text.

There was no cross-referencing or nesting, just a basic ordered list.

There is a relationship there, but it's a simple 1:1 relationship.

ubernostrum9y ago

Depends what you mean by "NoSQL".

I've gotten plenty of mileage out of using key-value stores to complement relational databases. They're good at acting as task queues and caches, for example.

danso9y ago· 6 in thread

ProPublica just published a bunch of data-related jobs and positions. The phrase "Proficiency in SQL is a must" makes an appearance: https://www.propublica.org/atpropublica/item/propublica-is-h...

[0] https://news.ycombinator.com/item?id=8505000 https://news.ycombinator.com/item?id=10585009

wodenokoto9y ago

I've recently started learning Pandas (a dataframe library for Python) and I find most queries cumbersome compared to the absolute minimal SQL I know.

lowmagnet9y ago

pweissbrod9y ago

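  import pandas
  import pyodbc

  def getdataframe(sql):
      # Run the query over ODBC and return the result as a DataFrame.
      connection = None
      try:
          connection = pyodbc.connect("DSN=myDSN", autocommit=True)
          return pandas.read_sql(sql, connection)
      except Exception as e:
          # Report the error; the function then returns None.
          print(repr(e))
      finally:
          # Close only if connect() succeeded, so a failed connect cannot raise here.
          if connection is not None:
              connection.close()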

arkh9y ago

> with the exception of spreadsheets and formulas

You may want to check Oracle's MODEL clause. Not as powerful as a spreadsheet, but you can already do a lot of things in the middle of your SQL.

danso9y ago

[0] http://www.orafaq.com/node/69

[1] http://sqlitebrowser.org/

kwillets9y ago

cm21879y ago· 5 in thread

maxxxxx9y ago

72deluxe9y ago

What you are lacking is:

a) knowledge of your chosen database system in sufficient detail to know the trade-offs and pros/cons, and

b) sufficient knowledge of SQL specific to your chosen database system in order to get the most out of it.

What you need is more knowledge, not more tools. Even if I have a hammer and a chainsaw, both are useless to me if I don't know enough about wood to build what I want.

antisthenes9y ago

The experience you describe is exactly like any other programming language. If I look at a Haskell program, it will look like gibberish to me, because I don't know the syntax.

majewsky9y ago

> It reminds me a little of regular expressions. If you don't use them all the time you forget the syntax and have to look it up every time you need them.

  Perl:   /^(?:aa|bb\b)+/
  vim:    /^\%(aa\|bb\>\)\+/
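
For comparison, a small sketch of the same pattern in Python, whose re module follows the Perl-style syntax. The matched_prefix helper is just an illustrative name.

```python
import re

# Same pattern as the Perl line above: one or more repetitions, anchored at the
# start, of either "aa" or "bb" followed by a word boundary.
pattern = re.compile(r"^(?:aa|bb\b)+")

def matched_prefix(text):
    """Return the part of text matched at the start, or None if nothing matches."""
    m = pattern.match(text)
    return m.group(0) if m else None

for sample in ["aabb", "aabbcc", "bb aa", "bbaa"]:
    print(repr(sample), "->", repr(matched_prefix(sample)))
```

The "bbaa" case fails entirely because there is no word boundary between "bb" and "aa", which is the kind of detail that is easy to forget in either dialect.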

tofflos9y ago

Having said that I agree with you that there is plenty of room for improvement in the SQL-language itself.

cousin_it9y ago· 4 in thread

weavie9y ago

The problem is that what joins to what differs depending on the context for what you are looking for.

Sometimes you join an order to supplier, other times it will be to customer and other times it will be to a list of products. Sometimes, all of them need to be included.

cousin_it9y ago

I guess the idea is to make the schema as readable as possible using nesting, and then use indices for the rest.

TimJYoung9y ago

Yep, it was one of the things used in quite a few systems before SQL became more widespread:

https://en.wikipedia.org/wiki/MultiValue

jimktrains29y ago

> One of my dreams is building a hybrid of RDBMS and Protocol Buffers.

That's actually one of my goals with https://github.com/jimktrains/drsdb. All communication would be done via protocol buffers, and schemas could be created or loaded at runtime to be verified and used.

The primary goal is a distributed RDBMS written in Rust.

zephyrfalcon9y ago· 4 in thread

[1] https://en.wikipedia.org/wiki/QUEL_query_languages

default-kramer9y ago

HTSQL has some great examples of where SQL could be better: http://htsql.org/doc/overview.html#why-not-sql. I would love to get that goodness in a language not tied to the rest of HTSQL.

Zak9y ago

There have been a few languages that compile to SQL. CLSQL for Common Lisp and Korma for Clojure come to mind.

jmcqk69y ago

marcosdumay9y ago

SQL is a great language. So great that it's a superset of everything good ever created on this domain¹, but still cognitively simple.

There is space for statistical and search-based (like Prolog) languages, but those are very niche. Proof of that is that they exist, but yet nobody here is talking about them.

1 - Ok, object store query languages are not a strict subset of standard SQL. But it's only a matter of adding one or two commands, like Postgres does.

baldfat9y ago· 3 in thread

SQL is the second most underused tool we have for dealing with data, AWK being the first. SQL is great because its logic fits the way data works and forces you to make good decisions earlier.

Everyone should spend a day learning SQL, if for no other reason than to gain the ability to think logically about data.

majewsky9y ago

awk is really nice. I'm starting to use it more and more in places where I previously used a long pipe of grep, cut and sed.

GuB-429y ago

I use perl when grep/cut/sed show their limits. I never really got into awk.

I suppose we all have our favorite tools.

tkyjonathan9y ago

Amen.

vinceguidry9y ago· 3 in thread

What I want to know is why nobody ever came up with a better query interface. Every abstraction I've ever seen was built on top of SQL.

jrochkind19y ago

What don't you like about SQL?

wolfgang429y ago

Instead, I've been toying with the idea of a language where queries are expressed as a pipeline of operations on result sets, like this:

    FROM employee
      /* result set: all employees */
    LEFT JOIN department ON employee.DepartmentID = department.DepartmentID
      /* result set: all employees + their department */
    ORDER BY employee.last
      /* result set: all employees + their department, sorted */
    WHERE department.name IN ('Sales', 'Engineering')
      /* result set: Sales and Engineering employees + their department, sorted */
    FIELDS employee.first, employee.last, department.name
      /* result set: Sales and Engineering employees' first, last, and department names, sorted */

This is a SELECT, but you can for example turn it into an update just by adding a clause to the end:

    UPDATE employee.salary = employee.salary * 1.05
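
For reference, here is roughly how the same two operations read in today's SQL, run through Python's sqlite3 module. The sample rows are invented, and the name columns are renamed first_name and last_name in this sketch to stay clear of SQL keywords; everything else follows the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (DepartmentID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        first_name TEXT, last_name TEXT, salary REAL,
        DepartmentID INTEGER REFERENCES department(DepartmentID)
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Engineering'), (3, 'Legal');
    INSERT INTO employee VALUES
        ('Ada', 'Young', 100.0, 2),
        ('Bob', 'Adams', 100.0, 1),
        ('Cy',  'Moore', 100.0, 3);
    """
)

# The SELECT: same steps as the pipeline, but in SQL's fixed clause order.
rows = conn.execute(
    """
    SELECT employee.first_name, employee.last_name, department.name
    FROM employee
    LEFT JOIN department ON employee.DepartmentID = department.DepartmentID
    WHERE department.name IN ('Sales', 'Engineering')
    ORDER BY employee.last_name
    """
).fetchall()
print(rows)

# The UPDATE: the filter has to be restated as a subquery instead of appended.
conn.execute(
    """
    UPDATE employee
    SET salary = salary * 1.05
    WHERE DepartmentID IN (
        SELECT DepartmentID FROM department
        WHERE name IN ('Sales', 'Engineering')
    )
    """
)
salaries = dict(conn.execute("SELECT first_name, salary FROM employee"))
print(salaries)
```

The contrast is the point: in the pipeline form the update is one extra line at the end, while in standard SQL the statement has to be restructured.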

spacemanmatt9y ago

skc9y ago· 2 in thread

I've always thought the main reason NoSQL solutions became popular is that developers could finally get at and manipulate the data the way they wanted to.

I've known some very, very prickly DBAs in my time who referred to the databases they looked after as "My database". So they would say things like "Don't put junk in my database".

And would give you endless grief over how you wrote your queries or asked you a million questions about why you needed a new table and why your proposed design was shit.

As a result, many of us devs tended to view SQL Databases as some sort of dark art. So in this regard, NoSQL is freeing at first glance.

But if I'm honest, once I got over my fear, the "pros" of NoSQL solutions in comparison to good ol' SQL seem to be relatively feeble.

I think it's easier to get up and running with a NoSQL solution because there is far less friction when it comes to rapid prototyping of ideas, but things get complex pretty quickly.

I'd also say that for the vast majority of applications out there, the difference between the two will mostly be a wash.

hackits9y ago

By any chance, were these DBAs also the Linux server administrators?

duozerk9y ago

That seems highly unprofessional. I won't even ls into users' directories without their express permission.

menzoic9y ago· 2 in thread

majewsky9y ago

> the modern SQL languages we use today are younger than 43yrs.

99% of people use a minuscule subset of SQL, most of which easily dates back decades.

Twisell9y ago

But so does JavaScript it's not like it hasn't changed...

ianamartin9y ago· 2 in thread

SQL was my entry point into software development, and I have a somewhat emotional attachment to it. And I'm quite glad that it worked that way.

SQL, relational theory, and set theory are a great place to start understanding how to work with data. And a great way to start understanding software.

All software deals with data. If you don't have a good understanding of data, you are never going to have a good understanding of software.

Then I found out that there are actually other languages that can do certain types of things much better. R, Python, C#, etc.

SQL is incredibly useful every day. Learning it, knowing it, understanding it, is a bare minimum for people I hire.

None of that is an excuse for the totally shitty article linked here.

We use SQL today because it's good and it works. Not many languages can say that these days.

yawaramin9y ago

wwweston9y ago

Thanks for the book recommendation.

(And if you're hiring, I have a math degree and know some SQL. :)

crimsonalucard9y ago· 2 in thread

remotehack9y ago

SQL is about relational theory; all that matters is the data.

> Instead wouldn't it be better for the language to explicitly allow the user to apply algorithms or procedures to make things more efficient

That...is hacking.

crimsonalucard9y ago

No it is not hacking.

By explicit I mean BinarySearch(Table, x = name) rather than "SELECT * FROM Table WHERE x = name"
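
To make the contrast concrete, here is a small sketch of what the declarative version does today, using SQLite from Python. The people table, the index name and the plan helper are assumptions for the example. The query text never changes; adding an index is what switches the engine from a scan to a tree search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [(f"user{i:05d}",) for i in range(10_000)],
)

def plan(sql, params=()):
    """Return SQLite's textual description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text

query = "SELECT * FROM people WHERE name = ?"
before = plan(query, ("user04242",))  # no index yet: a full table scan
conn.execute("CREATE INDEX idx_people_name ON people (name)")
after = plan(query, ("user04242",))   # same query, now a search via the index

print(before)
print(after)
```

The exact wording of the plan differs between SQLite versions, so the output is best read as a hint rather than a stable format.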

throwit2mewillU9y ago· 2 in thread

It works.

pc869y ago

"It works" is a pretty poor reason to continue using something. Horses and buggies work just fine. Automobiles are better.

I think it's important to look at why SQL is still the best tool for the job 43 years later, especially in the current climate of going to production with 6-week old JS frameworks.

throwit2mewillU9y ago

It works my young Padawan.

ryanar9y ago· 1 in thread

There are a lot of mentions of ORMs being the problem, being poor abstractions, etc.

In my work with Django's ORM I have run into problem queries as often as I have with doing SQL, and Django's ORM has never let me down.

The other advantage is automatic escaping to prevent SQL injection, which is still a top contender on OWASP's list. I never have to worry about SQL injection when using the ORM.

Maybe other ORMs are poor solutions, but at least with Django I have been very happy using it.
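
The protection itself is not specific to an ORM. As a sketch (plain sqlite3 rather than Django, with a made-up users table), the difference between splicing input into the SQL text and passing it as a bound parameter looks like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

hostile = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and changes the query's meaning.
unsafe_sql = "SELECT name FROM users WHERE name = '" + hostile + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is sent as a bound parameter and is only ever treated as a value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

An ORM mostly removes the temptation to write the first form, which is a real benefit even if the mechanism underneath is ordinary query parameterization.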

scriptkiddy9y ago

+1 for Django's ORM.

I find Python has some of the best ORMs out there between Django ORM, SQLAlchemy, Peewee, and Pony.

18nleung9y ago· 1 in thread

Does anyone know how to generate a visualization like the one under reason #2 ("Battle Tested")?

Here's the gif: http://imgur.com/K5a7U9O.gif

krallja9y ago

http://logstalgia.io/

csours9y ago· 1 in thread

Since a lot of people are thinking about this: Is there a good way to compose SQL?

I see a bunch of repeated items in many SQL queries, things that would be functions in another language.

One of my colleagues pointed out that this is indicative of poorly formed queries. What do you all think?

combatentropy9y ago

It is very likely that repetitive SQL could be helped by views (database views, not views in the MVC sense). Database views are just named queries:

  create view red_shirts as
  select *
  from shirts
  where color = 'red'
  ;

Then you can just say:

  select * from red_shirts;

This is a simple example. Views are normally much more intricate and useful. Basically any select-statement could be saved as a view.

Databases also let you define functions.
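
As a small illustration of the function side, SQLite lets the host program register a function and then call it from SQL. The sketch below uses Python's sqlite3 module; normalize_color and the sample rows are hypothetical, and a server database such as PostgreSQL would define the function with CREATE FUNCTION instead.

```python
import sqlite3

def normalize_color(value):
    """Hypothetical helper: trim and lower-case a colour name; pass NULL through."""
    return value.strip().lower() if value is not None else None

conn = sqlite3.connect(":memory:")
# Register the Python function under a SQL name, taking exactly one argument.
conn.create_function("normalize_color", 1, normalize_color)
conn.executescript(
    """
    CREATE TABLE shirts (id INTEGER PRIMARY KEY, color TEXT);
    INSERT INTO shirts (color) VALUES ('Red'), (' red '), ('BLUE');
    """
)

# The repeated clean-up logic now lives in one named place.
red_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM shirts WHERE normalize_color(color) = 'red' ORDER BY id"
    )
]
print(red_ids)
```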

agentultra9y ago

I've been programming for most of my life. SQL has been a big part of my career. And I love it. It's one of my top 5 languages.

dghf9y ago

> SQL and relational database management systems or RDBMS were invented simultaneously by Edgar F. Codd in the early 1970s.

Codd didn't invent SQL. Donald Chamberlin and Raymond Boyce did.

> SQL is originally based on relational algebra and tuple relational calculus

jaked899y ago

Wow, that's a subtle, almost unnoticeable promotion of MailChimp. /s

swalsh9y ago

_pmf_9y ago

I think LINQ with F#'s type providers would probably be what I have in mind (which works like JOOQ with a tighter integration).

Animats9y ago

Contrast this with most webcrap. Or most of the NoSQL databases.

cygned9y ago

I often don't understand why people try to avoid SQL at all costs instead of just learning and applying it properly. I don't understand those "we do SQL for everything" teams either.

An RDBMS in conjunction with NoSQL solutions can be a very powerful combination. We do a lot of Postgres + Redis + CouchDB in my projects.

dangoldin9y ago

edpichler9y ago

This reminds me of what my teacher said to me a decade ago: "Relational databases are the most successful software humanity has created."

siscia9y ago

I love SQL and I believe it will have a big jump in popularity with microservices.

Each service should have a separate data source and be solely responsible for a specific part of the data.

In this environment, however, a full RDBMS is a bit of overkill.

The solution I am working on is RediSQL: https://github.com/RedBeardLab/rediSQL

It is a Redis module that embeds SQLite.

Redis is nowadays a common piece in any infrastructure.

The little module plugs into Redis and exposes a new command, REDISQL.EXEC, that provides the ability to run SQL statements.

It is multithreaded, does not impact the performance of Redis, and is very simple to use.

Great write performance, I got 50k inserts per second on my machine, that should be enough for most microservices.

I would love any kind of feedback on the module, and if you need any help getting started, just open an issue.

https://github.com/RedBeardLab/rediSQL

rconti9y ago

KSQL and Oracle towers: http://www.bayareapilot.com/IMG_0317%20Large%20Web%20viewnea...

EternalData9y ago

Good old SQL. Your reliable friend that always shows up with the right amount of booze and gas money, and which, when you stop to think about it, basically hasn't ever majorly fucked up around you.

davidw9y ago

I think back to my first programming job and some of the stuff I used then: Perl, early versions of PHP and other such tools that I haven't used in a long time.

Two of the things I still use to this day, though, are:

* Postgres

* Emacs

tannhaeuser9y ago

LeanderK9y ago

manigandham9y ago

The amount of confusion in the comments highlights why we don't have anything better.

SQL is just a language; it specifically stands for Structured Query Language. It has nothing to do with the underlying database.

There is definitely potential for a better query language and there are examples like ReQL and GraphQL but SQL is still just fine for most use-cases.

rumcajz9y ago

Unlike imperative languages, which are a dime a dozen, there's almost no alternative to SQL when it comes to relational languages.

That being the case, people rarely even think about whether SQL is a good language or a bad language, whether it's lacking something etc.

But once you actually try thinking about it, it turns out that it's a pretty well designed language and any alternatives you can think of are usually much inferior to it.

Skylled9y ago

I think it's funny when the author compares SQL being most loved with least dreaded. They're the same thing. The percentages all add up to 100.

jordanthoms9y ago

arnon9y ago

When we approach a customer with our database, SQL compliance is super important to them.

Some of our competitors used to be 'SQL-like', and even they swapped to using full SQL.

I think the fact that SQL is based on solid mathematical principles really helps it stay relevant.

carapace9y ago

Should mention SQL wasn't the original RM language: https://en.wikipedia.org/wiki/Alpha_(programming_language)

;-)

threepipeproblm9y ago

Possibly relevant: Modern SQL is a good resource, by Markus Winand, on the newer aspects of the language. http://modern-sql.com/

gigatexal9y ago

You can get away with not using SQL if using the functionality from LINQ or Java streams but as a DBA I feel most at home with SQL.

eddd9y ago

SQL as a language is one thing. The fact that RDBMSs come with consistency and isolation guarantees is a different story.

misterbowfinger9y ago

Meh. I'm not one to sing the praises of SQL so highly. I understand its history and its use, but it's also really, really difficult to understand and figure out complex SQL queries.

sebringj9y ago

It lists Redis in there for the SQL ones. Redis has SQL now?

wittgenstein9y ago

I'm wondering what is the revenue of SQLizer?

z3t49y ago

I guess a lot of performance was sacrificed and a lot of optimizations were made, which makes SQL very fast in the current era.
