undefined | Better HN

0 pointscogman101y ago0 comments

> They sanitized the data, so it should have been fine.

This is a 101 rookie level approach to SQL or injection defense.

It's dumb for exactly the same reason why this is dumb

    "SELECT * FROM foo WHERE bar=" + sanitize(userInput)

The correct way to do something like this will always be parameterized input which looks something like this

    "SELECT * FROM foo WHERE bar=?"
    bindParameter(1, userInput);

Why? Because that the postgres protocol splits out the command and the data for the command in a way that can't be injected. Something that should be viewed as impossible to do when data and command are merged into 1 String.

IF this company wanted to build dynamic queries, then the only correct way to do that is to limit input to only valid variables. IE "isValidColumnName(userInput)" before sending the request. And even then, you'd not use psql to do that.

You simply can't use a generalized sanitizer and expect good results.

0 comments

18 comments · 6 top-level

Nijikokun1y ago· 4 in thread

His argument is that you are building on the assumption that this is safe:

    "SELECT * FROM foo WHERE bar=?"
    bindParameter(1, userInput);

But what happens when it turns out that isn't safe?

jerf1y ago

Then you, or someone else, fix the underlying library.

This being Postgres, that process was likely completed decades ago.

Anticipating the next question, "but what if it is still unsafe?", the answer is that there is no fundamental reason why this can't be safe code. Security exploits like this require a cooperation between the receiving code and the incoming data, and it is in fact perfectly possible for the receiving code to be completely safe. There is no such thing as data just so amazingly hackerish that it can blast through all protections. There has to be a hole, and this is among the best-tested and most scrutinized code paths in the world.

btown1y ago

There's some nuance here. Even the most battle-tested system, with distinct slots for executable code and primitive values, might have a way in which an untrusted primitive input can overrun a buffer, or be split in an unsafe way, and cause unexpected behavior. But there's a vast difference in attack surface between that, and "just give us a string, don't worry we'll sanitize it on our end."

It's all about defense in depth. Even a system that's tested all the way through with Coq or similar is still at the mercy of bugs in the specification or in underlying system libraries. But intentional API design can make it materially less likely that a security issue will arise, and that's worth a heck of a lot.

1 more reply

cogman10OP1y ago

That is not what's happening. You can't bind parameters in psql.

What beyond trust did was in fact what I said, constructing the entire query as one string and sending that on to psql. The sanitization method failed, but that's not what you should use anyways when dealing with user input.

If you are using a postgres client, then the message breakdown to postgres when you bind looks something like this (not the actual message stream format, just the jist of what it ends up looking like)

    SELECT * FROM foo WHERE bar=$1
    BIND 1 escape(userInput)
    ENDBIND

There isn't the same opportunity to create malformed data that can cause an injection attack in the Postgres message stream. There are far fewer things that need to be escaped in order to put in the full message. (I skimmed through the protocol, one improvement I'd make to it is adding a length to the bind instead of having a termination like it appears to do. That would, however, preclude streaming data)

These yahoos took a different route to running a command than what everyone does and should do and they were bit by it.

EDIT Actually, yes you can bind parameters with psql. However, it's there mostly as a way to test postgres and not something users are expected to use.

graphememes1y ago

They aren't saying that's what is happening but the underlying assumption is similar to that.

oefrha1y ago· 3 in thread

The more you work in software the more you should realize the developers writing security-critical software (in this case the one writing that sanitizer) are often/usually as clueless as you are. The solution? Hard to say.

namaria1y ago

I have had a developer look me dead in the eye and say "this is not a security concern, because I can't see how this can be exploited".

Security by obscurity from self. It was very hard to explain to that person what was wrong with that line of reasoning.

immibis1y ago

But... You have to see things that way or else literally everything becomes a security concern.. Extra whitespace before a semicolon? I don't see how it can be exploited, but with the mindset you imply, I have to treat it as a security concern. But removing the whitespace is also a security concern.

1 more reply

D-Coder1y ago

> It was very hard to explain to that person what was wrong with that line of reasoning.

"So... you're the smartest person in the world?"

1 more reply

thaumasiotes1y ago· 3 in thread

> The correct way to do something like this will always be parameterized input which looks something like this

> Why? Because [] the postgres protocol splits out the command and the data for the command in a way that can't be injected.

I'm not sure I'm comfortable with this. You can create a prepared statement and then pass user input to it as parameters, sure. https://www.postgresql.org/docs/17/sql-prepare.html

But who says that means your statement can't be injected? It would have to be true that the handling for EXECUTE statements is bug-free. Maybe it is bug-free. Or maybe it isn't. Maybe I can figure out just the right username to cause your prepared statement to have some undesirable side effects.

That wouldn't be SQL injection in the sense of putting hostile values into an SQL query in order to form a different SQL query, but it would be SQL injection in the sense of putting hostile values into an SQL query in order to accomplish nefarious goals via the database. The only real qualm I'd have about calling it "SQL injection" is that it wouldn't be portable across different database implementations; it would be more accurately described as "PostgreSQL injection".

If we feel entitled to assume that PostgreSQL's provided functionality to interpret strings that are provided to EXECUTE statements has no bugs, why aren't we also entitled to assume that PostgreSQL's provided functionality to interpret strings that are provided to the string escaper has no bugs? I don't really see the conceptual difference.

TheDong1y ago

The postgres escape function actually worked fine before this "CVE". It was documented as escaping something for use as part of a postgres query.

BeyondTrust used it as input to the 'psql' tool, which is an interactive tool you're not really supposed to programmatically invoke, and the documentation for the postgres escape function didn't say it escaped input for psql. Even though postgres was fine calling it a CVE and fixing it, I think this is 100% on BeyondTrust for assuming that escaping a string for a postgres query meant it was safe for psql.

If BeyondTrust had just used it as part of a postgres query string, the escape function would have been sufficient.

.... and that's also exactly the reason that using parameterized queries is better. With parameterized queries, the escaping and the query parsing are done in the same place, so there's no chance of confusion for the programming language's string library to get in the way, or for the psql tool's input parsing to re-interpret and alter the escaped string before sending it over the wire.

oefrha1y ago

> If BeyondTrust had just used it as part of a postgres query string, the escape function would have been sufficient.

That's completely false. The following (pseudo code because working with C strings is verbose and beside the point) with nothing to do with psql should be fine:

  PQexec(conn, "SELECT * FROM user WHERE nickname = '" + PQescapeString(user_input) + "';")

but thanks to the vulnerable PQescapeString(), the following user_input

  "\xc0'; DROP TABLE user"

would fuck it up. That's just the failed escape function leading to a classic SQL injection. Using psql makes it worse because psql can execute additional non-SQL commands, but this escape function is not "fine" at all with or without psql.

> With parameterized queries, the escaping and the query parsing are done in the same place

Again, wrong. For parametrized queries, params don't go through serialization because they don't need to hit the parser, there's no "escaping" whatsoever.

1 more reply

thaumasiotes1y ago

> The postgres escape function actually worked fine before this "CVE". It was documented as escaping something for use as part of a postgres query.

> BeyondTrust used it as input to the 'psql' tool, which is an interactive tool you're not really supposed to programmatically invoke, and the documentation for the postgres escape function didn't say it escaped input for psql.

But this is the documentation for psql:

> psql is a terminal-based front-end to PostgreSQL. It enables you to type in queries interactively, issue them to PostgreSQL, and see the query results. Alternatively, input can be from a file or from command line arguments. In addition, psql provides a number of meta-commands and various shell-like features to facilitate writing scripts and automating a wide variety of tasks.

We can learn two things from this:

1. You are definitely supposed to be able to invoke psql programmatically, or else the suggestion to "write scripts and automate a wide variety of tasks" would make no sense.

2. The input to psql is documented as a "query", and it seems fine to assume that a "query" for psql is the same thing as a "postgres query".

voxic111y ago· 1 in thread

This is the article I link my developers whenever I see them making this mistake "Don’t try to sanitize input. Escape output." https://benhoyt.com/writings/dont-sanitize-do-escape/

Its been fairly effective at making them realize the fundamental mistake they are making. Quoting the key part:

> The only code that knows what characters are dangerous is the code that’s outputting in a given context.

> So the better approach is to store whatever name the user enters verbatim, and then have the template system HTML-escape when outputting HTML, or properly escape JSON when outputting JSON and JavaScript.

> And of course use your SQL engine’s parameterized query features so it properly escapes variables when building SQL

TeMPOraL1y ago

> and then have the template system HTML-escape when outputting HTML, or properly escape JSON when outputting JSON and JavaScript

Or, stop using stringly template systems, and treat the data as what it is: a structured language, with well-defined grammar.

One of these days I need to write an article titled "Don't play with escaping strings. Serialize output.". Core idea being, "escaping your output" still looks too much like "sanitizing input"[0], and one tiny mistake is all it takes to give an attacker ability to inject arbitrary code into the page (or give an unlucky user ability to brick the page for themselves) - so instead of working in "string space", work in whatever semantics your output is, and treat the string form as a serialization problem. In case of HTML, that means constructing tree of tags as data structure, and then serializing them. Then, bugs in serializer notwithstanding, the whole class of injection problem disappears - you can't do "<h1>$text</h1>" -> "<h1></h1><script .... </h1>", when your "template" is made of data structures like [:h1, $text], because $text can't possibly alter the structure here. Etc.

In some sense, "Don't escape, serialize instead" is the complement of "Parse, don't validate".

(See also: make invalid states unrepresentable.)

[0] - Who ever sanitizes input? I've only ever seen this kind of sanitization the article describes happen in the output, within string-gluing templates.

gcr1y ago· 1 in thread

I’m starting to think that string-based languages like SQL which mix structure and content are a mistake.

Maybe future database systems will only accept queries serialized from protobuf, or JSON (output by a proper serializer)

immibis1y ago

JSON is a string-based language. So are Java, C, Rust, and so on.

furstenheim1y ago

100%.

Btw, even using psql directly allows binding parameters https://www.postgresql.org/docs/current/app-psql.html

j / k navigate · click thread line to collapse

0 comments

18 comments · 6 top-level

Nijikokun1y ago· 4 in thread

His argument is that you are building on the assumption that this is safe:

    "SELECT * FROM foo WHERE bar=?"
    bindParameter(1, userInput);

But what happens when it turns out that isn't safe?

jerf1y ago

Then you, or someone else, fix the underlying library.

This being Postgres, that process was likely completed decades ago.

btown1y ago

1 more reply

cogman10OP1y ago

That is not what's happening. You can't bind parameters in psql.

    SELECT * FROM foo WHERE bar=$1
    BIND 1 escape(userInput)
    ENDBIND

These yahoos took a different route to running a command than what everyone does and should do and they were bit by it.

EDIT Actually, yes you can bind parameters with psql. However, it's there mostly as a way to test postgres and not something users are expected to use.

graphememes1y ago

They aren't saying that's what is happening but the underlying assumption is similar to that.

oefrha1y ago· 3 in thread

namaria1y ago

I have had a developer look me dead in the eye and say "this is not a security concern, because I can't see how this can be exploited".

Security by obscurity from self. It was very hard to explain to that person what was wrong with that line of reasoning.

immibis1y ago

1 more reply

D-Coder1y ago

> It was very hard to explain to that person what was wrong with that line of reasoning.

"So... you're the smartest person in the world?"

1 more reply

thaumasiotes1y ago· 3 in thread

> The correct way to do something like this will always be parameterized input which looks something like this

> Why? Because [] the postgres protocol splits out the command and the data for the command in a way that can't be injected.

I'm not sure I'm comfortable with this. You can create a prepared statement and then pass user input to it as parameters, sure. https://www.postgresql.org/docs/17/sql-prepare.html

TheDong1y ago

The postgres escape function actually worked fine before this "CVE". It was documented as escaping something for use as part of a postgres query.

If BeyondTrust had just used it as part of a postgres query string, the escape function would have been sufficient.

oefrha1y ago

> If BeyondTrust had just used it as part of a postgres query string, the escape function would have been sufficient.

That's completely false. The following (pseudo code because working with C strings is verbose and beside the point) with nothing to do with psql should be fine:

  PQexec(conn, "SELECT * FROM user WHERE nickname = '" + PQescapeString(user_input) + "';")

but thanks to the vulnerable PQescapeString(), the following user_input

  "\xc0'; DROP TABLE user"

> With parameterized queries, the escaping and the query parsing are done in the same place

Again, wrong. For parametrized queries, params don't go through serialization because they don't need to hit the parser, there's no "escaping" whatsoever.

1 more reply

thaumasiotes1y ago

> The postgres escape function actually worked fine before this "CVE". It was documented as escaping something for use as part of a postgres query.

But this is the documentation for psql:

We can learn two things from this:

1. You are definitely supposed to be able to invoke psql programmatically, or else the suggestion to "write scripts and automate a wide variety of tasks" would make no sense.

2. The input to psql is documented as a "query", and it seems fine to assume that a "query" for psql is the same thing as a "postgres query".

voxic111y ago· 1 in thread

This is the article I link my developers whenever I see them making this mistake "Don’t try to sanitize input. Escape output." https://benhoyt.com/writings/dont-sanitize-do-escape/

Its been fairly effective at making them realize the fundamental mistake they are making. Quoting the key part:

> The only code that knows what characters are dangerous is the code that’s outputting in a given context.

> And of course use your SQL engine’s parameterized query features so it properly escapes variables when building SQL

TeMPOraL1y ago

> and then have the template system HTML-escape when outputting HTML, or properly escape JSON when outputting JSON and JavaScript

Or, stop using stringly template systems, and treat the data as what it is: a structured language, with well-defined grammar.

In some sense, "Don't escape, serialize instead" is the complement of "Parse, don't validate".

(See also: make invalid states unrepresentable.)

[0] - Who ever sanitizes input? I've only ever seen this kind of sanitization the article describes happen in the output, within string-gluing templates.

gcr1y ago· 1 in thread

I’m starting to think that string-based languages like SQL which mix structure and content are a mistake.

Maybe future database systems will only accept queries serialized from protobuf, or JSON (output by a proper serializer)

immibis1y ago

JSON is a string-based language. So are Java, C, Rust, and so on.

furstenheim1y ago

100%.

Btw, even using psql directly allows binding parameters https://www.postgresql.org/docs/current/app-psql.html

j / k navigate · click thread line to collapse