This is a 101, rookie-level approach to SQL injection defense. It's dumb for exactly the same reason this is dumb:
"SELECT * FROM foo WHERE bar=" + sanitize(userInput)
The correct way to do something like this will always be parameterized input, which looks something like this:
"SELECT * FROM foo WHERE bar=?"
bindParameter(1, userInput);
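As an illustration, here is that pattern with Python's stdlib sqlite3 module standing in for a Postgres client (the table and values are invented for the example). The driver ships the query text and the values separately, so a hostile string never reaches the SQL parser:

```python
import sqlite3

# In-memory toy database; "foo" and "bar" are made up for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (bar TEXT)")
conn.execute("INSERT INTO foo (bar) VALUES (?)", ("alice",))

# A classic injection attempt, passed as a bound parameter.
user_input = "' OR '1'='1"
rows = conn.execute("SELECT * FROM foo WHERE bar = ?", (user_input,)).fetchall()
print(rows)  # [] -- the payload matched nothing; it was treated as plain data

# A legitimate value works as expected.
rows = conn.execute("SELECT * FROM foo WHERE bar = ?", ("alice",)).fetchall()
print(rows)  # [('alice',)]
```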
Why? Because the postgres protocol splits out the command and the data for the command in a way that can't be injected, something that should be viewed as impossible when data and command are merged into one string.

If this company wanted to build dynamic queries, then the only correct way to do that is to limit input to known-valid values, i.e. check isValidColumnName(userInput) before sending the request. And even then, you'd not use psql to do it.
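A sketch of that allowlist check in Python (the column names here are assumptions for the example; the point is exact comparison against a fixed set of known-good identifiers, never "cleaning" the input):

```python
# Known-good identifiers; in real code this would mirror your actual schema.
ALLOWED_COLUMNS = {"bar", "baz", "created_at"}

def is_valid_column_name(name: str) -> bool:
    return name in ALLOWED_COLUMNS

def order_clause(user_input: str) -> str:
    # Identifiers can't be bound as query parameters, so the only safe
    # move is to reject anything that isn't an exact, known column name.
    if not is_valid_column_name(user_input):
        raise ValueError(f"unknown column: {user_input!r}")
    return f"ORDER BY {user_input}"
```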
You simply can't use a generalized sanitizer and expect good results.
It's been fairly effective at making them realize the fundamental mistake they are making. Quoting the key part:
> The only code that knows what characters are dangerous is the code that’s outputting in a given context.
> So the better approach is to store whatever name the user enters verbatim, and then have the template system HTML-escape when outputting HTML, or properly escape JSON when outputting JSON and JavaScript.
> And of course use your SQL engine’s parameterized query features so it properly escapes variables when building SQL
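The quoted advice can be sketched with Python stdlib helpers (the value here is invented): store the raw value verbatim, and escape per output context only at the point of output:

```python
import json
from html import escape

name = '<b>"Bobby"</b>'          # stored verbatim, exactly as the user typed it

html_out = f"<p>Hello {escape(name)}</p>"   # HTML-escape only when emitting HTML
js_out = f"var name = {json.dumps(name)};"  # JSON-encode only when emitting JS/JSON

print(html_out)  # <p>Hello &lt;b&gt;&quot;Bobby&quot;&lt;/b&gt;</p>
print(js_out)    # var name = "<b>\"Bobby\"</b>";
```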
Or, stop using stringly-typed template systems, and treat the data as what it is: a structured language with a well-defined grammar.
One of these days I need to write an article titled "Don't play with escaping strings. Serialize output." The core idea being: "escaping your output" still looks too much like "sanitizing input"[0], and one tiny mistake is all it takes to give an attacker the ability to inject arbitrary code into the page (or give an unlucky user the ability to brick the page for themselves). So instead of working in "string space", work in whatever semantics your output is, and treat the string form as a serialization problem. In the case of HTML, that means constructing a tree of tags as a data structure, and then serializing it. Then, bugs in the serializer notwithstanding, the whole class of injection problems disappears: you can't do "<h1>$text</h1>" -> "<h1></h1><script .... </h1>" when your "template" is made of data structures like [:h1, $text], because $text can't possibly alter the structure there. Etc.
In some sense, "Don't escape, serialize instead" is the complement of "Parse, don't validate".
(See also: make invalid states unrepresentable.)
--
[0] - Who ever sanitizes input? I've only ever seen the kind of sanitization the article describes happen on output, within string-gluing templates.
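That [:h1, $text] idea can be sketched in a few lines of Python (a toy serializer, ignoring attributes, void elements, and other real-world HTML concerns):

```python
from html import escape

def serialize(node) -> str:
    """Serialize a tree like ["h1", "some text"] to HTML.
    Strings are text nodes and get escaped here, at output time,
    so a text node can never change the tree's structure."""
    if isinstance(node, str):
        return escape(node)
    tag, *children = node
    return f"<{tag}>{''.join(serialize(c) for c in children)}</{tag}>"

hostile = "</h1><script>alert(1)</script>"
page = serialize(["body", ["h1", hostile]])
print(page)  # the payload comes out as inert, escaped text
```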
Security by obscurity from self. It was very hard to explain to that person what was wrong with that line of reasoning.
"So... you're the smartest person in the world?"
> Why? Because [] the postgres protocol splits out the command and the data for the command in a way that can't be injected.
I'm not sure I'm comfortable with this. You can create a prepared statement and then pass user input to it as parameters, sure. https://www.postgresql.org/docs/17/sql-prepare.html
But who says that means your statement can't be injected? It would have to be true that the handling for EXECUTE statements is bug-free. Maybe it is bug-free. Or maybe it isn't. Maybe I can figure out just the right username to cause your prepared statement to have some undesirable side effects.
That wouldn't be SQL injection in the sense of putting hostile values into an SQL query in order to form a different SQL query, but it would be SQL injection in the sense of putting hostile values into an SQL query in order to accomplish nefarious goals via the database. The only real qualm I'd have about calling it "SQL injection" is that it wouldn't be portable across different database implementations; it would be more accurately described as "PostgreSQL injection".
If we feel entitled to assume that PostgreSQL's provided functionality to interpret strings that are provided to EXECUTE statements has no bugs, why aren't we also entitled to assume that PostgreSQL's provided functionality to interpret strings that are provided to the string escaper has no bugs? I don't really see the conceptual difference.
BeyondTrust used it as input to the 'psql' tool, which is an interactive tool you're not really supposed to programmatically invoke, and the documentation for the postgres escape function didn't say it escaped input for psql. Even though postgres was fine calling it a CVE and fixing it, I think this is 100% on BeyondTrust for assuming that escaping a string for a postgres query meant it was safe for psql.
If BeyondTrust had just used it as part of a postgres query string, the escape function would have been sufficient.
.... and that's also exactly the reason that using parameterized queries is better. With parameterized queries, the escaping and the query parsing happen in the same place, so there's no chance for the programming language's string handling to get in the way, or for the psql tool's input parsing to re-interpret and alter the escaped string before it is sent over the wire.
That's completely false. The following (pseudocode, because working with C strings is verbose and beside the point), which has nothing to do with psql, should be fine:
PQexec(conn, "SELECT * FROM user WHERE nickname = '" + PQescapeString(user_input) + "';")
but thanks to the vulnerable PQescapeString(), the following user_input
"\xc0'; DROP TABLE user"
would fuck it up. That's just the failed escape function leading to a classic SQL injection. Using psql makes it worse, because psql can execute additional non-SQL commands, but this escape function is not "fine" at all, with or without psql.

> With parameterized queries, the escaping and the query parsing are done in the same place
Again, wrong. For parameterized queries, params don't go through any such serialization, because they never need to hit the parser; there's no "escaping" whatsoever.
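The class of bug can be simulated in a few lines of Python. This is a toy model, not PostgreSQL's actual parser: the escaper naively doubles quotes byte-by-byte, while the "parser" treats any byte >= 0xC0 as the lead byte of a two-byte character, so a bogus lead byte swallows the escaping quote:

```python
def naive_escape(value: bytes) -> bytes:
    # Encoding-unaware escaping: just double the single quotes.
    return value.replace(b"'", b"''")

def toy_parse_literal(b: bytes):
    """Parse a quoted literal (opening quote already consumed) under a
    toy multibyte rule: bytes >= 0xC0 consume the following byte too."""
    out, i = bytearray(), 0
    while i < len(b):
        c = b[i]
        if c == 0x27:                      # a single quote...
            if b[i + 1:i + 2] == b"'":     # ...doubled: a literal quote
                out.append(c); i += 2; continue
            return bytes(out), b[i + 1:]   # ...alone: end of the literal
        if c >= 0xC0:                      # "lead byte" swallows the next
            out += b[i:i + 2]; i += 2      # byte, even an escaping quote
            continue
        out.append(c); i += 1
    return bytes(out), b""

user_input = b"\xc0'; DROP TABLE user; --"
query = b"SELECT * FROM user WHERE nickname = '" + naive_escape(user_input) + b"';"
literal, rest = toy_parse_literal(query.split(b"'", 1)[1])
print(rest)  # b"; DROP TABLE user; --';" -- the payload escaped the literal
```

Well-formed input still round-trips fine under the same rules; it's only the mismatch between what the escaper and the parser consider a character boundary that opens the hole.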
> BeyondTrust used it as input to the 'psql' tool, which is an interactive tool you're not really supposed to programmatically invoke, and the documentation for the postgres escape function didn't say it escaped input for psql.
But this is the documentation for psql:
> psql is a terminal-based front-end to PostgreSQL. It enables you to type in queries interactively, issue them to PostgreSQL, and see the query results. Alternatively, input can be from a file or from command line arguments. In addition, psql provides a number of meta-commands and various shell-like features to facilitate writing scripts and automating a wide variety of tasks.
We can learn two things from this:
1. You are definitely supposed to be able to invoke psql programmatically, or else the suggestion to "write scripts and automate a wide variety of tasks" would make no sense.
2. The input to psql is documented as a "query", and it seems fine to assume that a "query" for psql is the same thing as a "postgres query".
Btw, even using psql directly allows binding parameters https://www.postgresql.org/docs/current/app-psql.html
Maybe future database systems will only accept queries serialized from protobuf, or JSON (output by a proper serializer).
"SELECT * FROM foo WHERE bar=?"
bindParameter(1, userInput);
But what happens when it turns out that isn't safe?

This being Postgres, that process was likely completed decades ago.
Anticipating the next question, "but what if it is still unsafe?", the answer is that there is no fundamental reason why this can't be safe code. Security exploits like this require cooperation between the receiving code and the incoming data, and it is in fact perfectly possible for the receiving code to be completely safe. There is no such thing as data so amazingly hackerish that it can blast through all protections. There has to be a hole, and this is among the best-tested and most scrutinized code paths in the world.
It's all about defense in depth. Even a system that's tested all the way through with Coq or similar is still at the mercy of bugs in the specification or in underlying system libraries. But intentional API design can make it materially less likely that a security issue will arise, and that's worth a heck of a lot.
What BeyondTrust did was in fact what I said: constructing the entire query as one string and sending it on to psql. The sanitization method failed, but that's not what you should use anyway when dealing with user input.
If you are using a postgres client, then the message breakdown to postgres when you bind looks something like this (not the actual message stream format, just the gist of what it ends up looking like):
SELECT * FROM foo WHERE bar=$1
BIND 1 <length of userInput> userInput
ENDBIND
There isn't the same opportunity to create malformed data that can cause an injection attack in the Postgres message stream: in the actual protocol, each parameter value in the Bind message carries an explicit byte length rather than being escaped or terminated, so nothing in the value can alter the message structure. These yahoos took a different route to running a command than what everyone does and should do, and they were bit by it.
EDIT: Actually, yes, you can bind parameters with psql. However, it's there mostly as a way to test postgres, not something users are expected to use.
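For reference, the extended-protocol Bind ('B') message length-prefixes every parameter value, which is why nothing in a bound value ever needs escaping. Below is a simplified sketch of the layout (unnamed portal, text format, no result-format codes; a real client handles more cases, so treat this as illustrative rather than a working implementation):

```python
import struct

def bind_message(statement_name: str, params: list) -> bytes:
    """Build a simplified PostgreSQL wire-protocol Bind ('B') message."""
    body = b"\x00"                                # unnamed destination portal
    body += statement_name.encode() + b"\x00"     # prepared statement name
    body += struct.pack("!H", 0)                  # 0 param-format codes => all text
    body += struct.pack("!H", len(params))        # number of parameter values
    for p in params:
        body += struct.pack("!i", len(p)) + p     # Int32 byte length, then raw bytes
    body += struct.pack("!H", 0)                  # 0 result-format codes
    return b"B" + struct.pack("!i", len(body) + 4) + body

msg = bind_message("s1", [b"'; DROP TABLE user; --"])
# The hostile bytes travel verbatim inside a length-prefixed field;
# the server never scans them for quotes or terminators.
```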