I Went to SQL Injection Court

(sockpuppet.org)

1230 points · mrkurt · 1y ago · 435 comments

170 comments · 45 top-level

chaps · 1y ago · 36 in thread

Hi everyone, I'm the plaintiff in this lawsuit. I'm still working on my companion post for tptacek's post! I'll have it ready Soon TM, but feel free to ask me any questions in the meantime here.

While you're waiting, check out this older post: https://mchap.io/that-time-the-city-of-seattle-accidentally-...

qingcharles · 1y ago

Matt, you do the Lord's work.

Bear in mind that Matt technically lost this, even with the backing of some of the absolute best civil rights lawyers in the country, Loevy and Loevy, fighting on his behalf. This shows you the absurd difficulty in fighting city hall, especially if you're crazy enough to do it without representation.

The one thing working in our favor is what is proposed in TFA: change the law. Once the state Supreme Court has ruled, you're hosed unless you can get an amendment. Illinois has a very strong history of amending its FOIA statute, although a proportion of those changes are to further protect information from disclosure, not always on the side of sunshine.

Another change that needs to happen is strong punishment for bodies who lose these fights. In Illinois this is limited to a "$5000 civil penalty" against the body. What is a civil penalty? It's vaguely defined. They used to throw the money to the plaintiff, but in the later cases I fought they simply awarded the money to the county. As one State's Attorney said to me "I don't care if I lose every case, I just write a check out to myself."

(one final note: be careful what you wish for when you litigate, you can end up with an appellate decision like this that solidifies in law the exact thing you were fighting. It's nobody's fault, but it happens. I ended up with one absurd decision that removed prisoners' rights rather than enhanced them.)

tptacek · 1y ago

A losing public body is also generally on the hook for attorney's fees, which can be considerable. But the general problem here is that the public bodies are all spending someone else's money, so the real deterrent you have is how much of their time you can credibly threaten to eat up with legal actions.

dataflow · 1y ago

I don't understand the argument that knowing the column names doesn't help an attacker? Especially in a database that doesn't allow wildcards, doesn't it make things much easier if you know you can do '); SELECT col FROM logins, as opposed to having to guess the column name?

And I don't think I disagree with the court on schema vs. file layouts either. It's not the file layout, but it's analogous: it tells you how the "files" (records) are laid out on the "file system" (database tables). For example, denormalization is very analogous to inlining of data in a file record. The notion that filesystems are effectively databases itself is a well known one too. How do you argue they aren't analogous?

tczMUFlmoNk · 1y ago

You can always `SELECT table_name, column_name, data_type FROM information_schema.columns`, which is part of the SQL standard. https://www.postgresql.org/docs/current/infoschema-columns.h...

Plus, generally if you have SQL injection, you have multiple tries. You're not going to be locked out after one shot. And there's only so many combinations of `SELECT {id,userid,user_id,uid} FROM {user,users,login,logins,customer,customers}` before you find something useful.
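A minimal sketch of both points above, using Python's built-in `sqlite3` so it runs anywhere. SQLite does not implement `information_schema`; its catalog lives in `sqlite_master` and `pragma_table_info`, which play the same role here. The `logins` table and its columns are made up for illustration and say nothing about any real system.

```python
import sqlite3

# Hypothetical database standing in for an application's real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (user_id INTEGER PRIMARY KEY, email TEXT, pw_hash TEXT)")

def list_schema(conn):
    """Return (table, column, declared type) for every user table."""
    rows = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # pragma_table_info is SQLite's table-valued form of PRAGMA table_info.
        for _cid, name, col_type, *_rest in conn.execute(
            "SELECT * FROM pragma_table_info(?)", (table,)
        ):
            rows.append((table, name, col_type))
    return rows

def guess_column(conn, tables, columns):
    """Try the naming guesses from the comment; a wrong guess only raises an error."""
    for table in tables:
        for column in columns:
            try:
                conn.execute(f"SELECT {column} FROM {table} LIMIT 1")
                return table, column
            except sqlite3.OperationalError:
                continue  # no such table or column; try the next guess
    return None

print(list_schema(conn))
print(guess_column(conn, ["user", "users", "login", "logins"],
                   ["id", "userid", "user_id", "uid"]))
```

Sixteen guesses is the worst case for this toy list, which is the sense in which the names are cheap to find once any query can be run.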

chaps · 1y ago

The Department of Justice disagrees and voluntarily releases column and table names: https://www.justice.gov/afp/media/1186431/dl?inline=

gwd · 1y ago

> I don't understand the argument that knowing the column names doesn't help an attacker?

So Kevin Mitnick supposedly did most of his hacking using "social engineering". He'd call up some person, pretend to be in some other department within their organization, and ask them for some specific bit of information he needed to further his attack (or ask them to change some specific thing that would allow him to further his attack).

Would knowing the structure of Illinois governmental organizations help someone perform social engineering attacks against them? Yes, absolutely.

Should Illinois therefore keep the internal structures of their organizations -- the department names and the officials who run them -- secret? No, absolutely not.

First of all, if an attacker doesn't know them, they'll just use other social engineering attacks to figure them out; i.e., hiding the structure doesn't stop social engineering attacks, it just slows them down. Secondly, the value to the public of being able to navigate governmental structures far outweighs the cost of potential attacks.

This seems to me to be a direct analog: The "organizational structure" is the "database schema", and the "willingness to help a random person on the phone who seems to know what they're talking about" is the "SQL injection vulnerability". If an attacker knows the schema, their job is faster; but if they don't know the schema, they'll just use attacks to figure out the schema; so keeping it private doesn't stop an attack, only slow it down. And the benefit to the public of being able to issue FOIA requests far outweighs the cost of potential attacks.

AdamJacobMuller · 1y ago

> And I don't think I disagree with the court on schema vs. file layouts either.

I disagree that the law should prohibit disclosing "file layouts" but it's pretty clear that the law does block that, and I fundamentally agree with you that schemas are directly analogous to file layouts and thus restricted.

dmurray · 1y ago

And this part seems self-defeating:

> Attackers like me use SQL injection attacks to recover SQL schemas. The schema is the product of an attack, not one of its predicates.

If it's the product of an attack, but not the end goal, surely it's of value to the attacker?

It seems clear to me that the statute does, as worded, in principle allow the city not to disclose the database schema - it would compromise the security of the system, or at the very least, it would for some systems, so each request needs to be litigated individually.

The proposed amendment sounds like a good way to fix this - is it likely that will pass?

econ · 1y ago

If you have an injection friendly application then that is the security problem.

Say someone hacks the db, is the problem easy-to-guess table names? The column should never have been called "passwords"?

Perhaps 30 years ago that would sound good.

Obscurity should hardly ever be a line of defense. If it is the only defense the problem isn't that it wasn't obscure enough.

Edit:

I'll do you one better. If you so much as suggest that obscurity is good security you actually openly invite people to fool around with your applications. The odds that holes are to be found are much better than elsewhere.

ic4l · 1y ago

I agree with you. Knowing the exact column names can speed up an attack and, in some cases, make it more feasible.

Why don’t they just request disclosure of what’s actually stored and allow renaming of the columns? It seems odd that knowing the exact column names would be necessary if the goal is simply to understand what data is being stored and its intended purpose.

fsckboy · 1y ago

>It's not the file layout, but it's analogous...How do you argue they aren't analogous?

laws don't get to be analogous

foia request: "I'd like the report the committee prepared about the costs for the new bridge"

response: "denied. the report contains costs laid out in tables with headings, which while not being schemas are analogous, with schemas not being files but being analogous"

mcv · 1y ago

Yeah, I think it's still useful info for an attacker. But only if the system was actually developed by amateurs who never heard of parameterized queries.

I find it a bit bizarre that the city uses "our system was developed with no consideration for security" as a valid defense.
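For readers who haven't seen the difference, here is a small sketch of string-built versus parameterized queries, using Python's built-in `sqlite3`. The `tickets` and `logins` tables, their columns, and the payload are all invented for illustration; nothing here is taken from CANVAS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (plate TEXT, amount INTEGER);
    CREATE TABLE logins (username TEXT, pw_hash TEXT);
    INSERT INTO tickets VALUES ('ABC123', 60);
    INSERT INTO logins VALUES ('admin', 'not-a-real-hash');
""")

def lookup_unsafe(plate):
    # Vulnerable: user input is spliced into the SQL text itself.
    return conn.execute(
        f"SELECT plate, amount FROM tickets WHERE plate = '{plate}'"
    ).fetchall()

def lookup_safe(plate):
    # Parameterized: the value is passed separately from the SQL text,
    # so it is only ever compared as data, never parsed as SQL.
    return conn.execute(
        "SELECT plate, amount FROM tickets WHERE plate = ?", (plate,)
    ).fetchall()

payload = "x' UNION SELECT username, pw_hash FROM logins --"
print(lookup_unsafe(payload))  # rows from logins leak through
print(lookup_safe(payload))    # no ticket has that plate, so nothing comes back
```

The payload only works against `lookup_unsafe`, and only because it names a real table and real columns. With `lookup_safe`, knowing those names buys the attacker nothing on this code path.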

IshKebab · 1y ago

'); SELECT * FROM logins --

foota · 1y ago

Out of curiosity, could you ask for something like "one row of data from every table in the CANVAS database"?

mbreese · 1y ago

This is a technical solution to a people problem. My reading is that the city doesn’t want to give up this information. If that’s the case, a technical solution wouldn’t work, no matter how easy it is. And given that this has already gone to the Illinois Supreme Court (and lost), the only solution is what is discussed at the end: updating the law.

hathawsh · 1y ago

Kudos to you for enduring through this fight! We can only achieve transparency when people choose not to be complacent. Thank you.

What do you think are the next steps?

chaps · 1y ago

My first step is to actually finish my post :)

But after that, getting a reasonable law passed to fix this now-broken nonsense.

doctorpangloss · 1y ago

What are the administrators of CANVAS hiding?

chaps · 1y ago

Hard to say. One of my personal drivers for this lawsuit is a tip I received that said that Chicago has a list of vendors whose tickets are dropped in the back-end. When I requested that info, the city said they had no such list. I trust my source, so having schema information could help figure out the extent and if they were lying.

butlike · 1y ago

'ethnicity' header, 'net_income' header... wouldn't doubt chicago could be cave man enough to do this

maCDzP · 1y ago

Have you tried looking for information from the developer about CANVAS? With any luck the developer has support documentation online that describes CANVAS and maybe you'll be able to narrow down your FOIA request.

manquer · 1y ago

I think the point of the lawsuit is less about CANVAS schema itself and more about the ability of the government to hide this kind of information from FOIA requests.

notjulianjaynes · 1y ago

Damn, this is impressive. I've been fighting with a state agency since December for 17,000 emails. I don't think I've ever tried to request emails and received zero push-back, but a $33 million estimate just, chef's kiss

gwerbret · 1y ago

Very interesting case! Just one question: to what extent do changes in database schemata fall under FOIA in Illinois? That is, if they should change the database schema to conceal whatever it is they're fighting tooth and nail to hide, are they compelled to retain detailed information about that change? Or can they later present you (should the legislation pass) with a cleaned-up, nothing-to-see-here updated version?

mcnichol · 1y ago

I don't want to take away any steam from your sails but giving bad information in regards to case law shouldn't be taken lightly. Your "expert witness" did you a disservice.

Schema is very much a critical field in terms of AuthZ privileges. Just knowing the structure is not far off from knowing the max entropy a password may hold. In regards to InfoSec, table structure is the recon phase which limits effort and minimizes time. Someone with that much time in security knows DBs will be hacked, not if but when. Time is an incredibly important tool which is why we have expirations on so many authN and authZ windows of attack.

I'm glad that you are challenging them but I believe a credible engineer would have made mince meat of your expert and hurt the rest of us who want to see you successful.

It's possible rewriting certain statutes can help us but there is no company worth its salt that would share DB schema.

thayne · 1y ago

> Just knowing the structure is not far off from knowing the max entropy a password may hold

Not if the password is hashed, as it should be. Unless the schema somehow indicates that it uses a hash algorithm such as bcrypt that has a maximum password length. And even then, if they pre-hash the password, the password itself could have more entropy than that. And if there is a maximum password length, then you can probably figure that out via other means, like it being listed in the requirements when you set your password. It does tell you the size of the hash of the password, but if the maximum entropy is sufficiently high, as it should be, then it doesn't really matter; it would still be impractical to brute force.

> there is no company worth its salt that would share DB schema

So you are saying that every company with a self-hosted or open source product that uses a database isn't worth their salt? If your DB is running on a customer's infrastructure, that customer will by necessity have access to the schema. And likewise if the source code for a product is publicly available it is trivial to determine the schema.
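A quick sketch of the hashing point above: the width of a hash column reveals the hash size, not the length or entropy of the password behind it. PBKDF2 from Python's standard library is used here purely as a stand-in; which algorithm any real system uses is not known from the thread, and the passwords are made up.

```python
import hashlib
import os

def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 from the standard library; with the default
    # dklen the output is always 32 bytes, whatever the input length.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

salt = os.urandom(16)
short = hash_password("hunter2", salt)
long_ = hash_password("correct horse battery staple " * 10, salt)

# Both hashes would fit the same fixed-width column.
print(len(short), len(long_))
```

A 7-character password and a 290-character one land in the same 32-byte field, so the column definition alone puts no useful bound on what users typed.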

ra · 1y ago

They can produce a report using English language labels instead of the db column names. Their argument isn't fact, it's vexatious obstinacy.

hennell · 1y ago

As mentioned in the post, FOIA tends to only include existing records/information; it doesn't extend to producing new work. So producing a new report would be considered too much work. (But fighting a lawsuit to not reveal the schema is fine.)

waitwhatwhoa · 1y ago

When can we submit witness slips? Is there a mailing list for updates we can join? Good luck!

hn_user82179 · 1y ago

This older post was such a fantastic read, thanks for sharing your story!

layoric · 1y ago

It's dated from ~2 weeks ago... is there other date information I am missing?

monksy · 1y ago

What I want to know: How much malort does the city expense a year?

foota · 1y ago

> Normally, a flustered public records officer would just reject a giant request for being for “unduly burdensome”… but this sort of estimate is practically unheard of. So much so that other FOIA nerds have told me that this is the second biggest request they've ever seen. The passive aggression is thick. Needless to say, it's not something I'm willing to pay for!

Welcome to Seattle :-)

geoduck14 · 1y ago

> that's the second biggest FOIA request I've ever seen!

-Guybrush, from The Secret of Monkey Island

mmaunder · 1y ago

Thanks for fighting the good fight for us all!

rnewme · 1y ago

The footer links to dead x account.

SkidanovAlex · 1y ago · 11 in thread

While I believe that the city should share the schema, and that the city is effectively arguing for security through obscurity, I disagree with the main premise of the article: that knowing SQL schema doesn't help the attacker.

If I understand the argument of the author here:

> Attackers like me use SQL injection attacks to recover SQL schemas. The schema is the product of an attack, not one of its predicates

The author appears to imply that once the vulnerability is found, the schema can be recovered anyway. That is not always the case. It is perfectly viable to find a SQL injection that allows fetching some data from the table being queried, but not from any other table, including `information_schema` or similar. If all the signal you get from the vulnerability is "query failed" or "query succeeded, here's the data", knowing the schema makes it much easier to exploit.

> the problem is that every computer system connected to the Internet is being attacked every minute of every day

If you specifically log failed DB queries, then for all the possible injections that such 24/7 attacks would find, you have already patched them. The log would then not be deafening until someone stumbles on the actual injection (that, for example, only exists for logged in users, and thus is not found by bots), in which case you have time to see it and patch before the attacker finds a way to actually utilize it.

Knowing the schema not only expedites their ability to take advantage of the vulnerability, but also increases their chances of probing the injection without triggering the query failure to begin with.
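To make the "log failed DB queries" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` and `logging`. The `run_logged` helper, the `tickets` table, and the in-memory `failed_queries` list are hypothetical; a real deployment would more likely hook this into its database driver or ORM.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("db")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (plate TEXT, amount INTEGER)")

failed_queries = []  # kept in memory here only so the example is easy to inspect

def run_logged(sql, params=()):
    """Execute a query; record and log any statement the database rejects."""
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.Error as exc:
        failed_queries.append(sql)
        log.warning("query failed: %s (%s)", sql, exc)
        return None

run_logged("SELECT plate FROM tickets")          # valid, nothing logged
run_logged("SELECT licence FROM tickets")        # wrong column guess, logged
run_logged("SELECT plate FROM tickets WHERE '")  # syntax error from a probe, logged
print(len(failed_queries))
```

Only the two rejected statements end up in the log. A probe that already uses the right names looks like the first call and leaves no trace in this particular feed.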

florbnit · 1y ago

> that knowing SQL schema doesn't help the attacker.

Knowing the name of the service helps the attacker, knowing the name of government officials working at city hall helps attackers, knowing the legal description of what a parking ticket is helps attackers. If you are sued and decide you want to hack the government knowing the details of the suit against you helps you in your attack.

The barrier is not “any helpful information must be censored”; the barrier is “don’t disclose passwords or code that would divulge backdoors”. A schema cannot be that.

Volundr · 1y ago

I'm not an attacker, just a boring old software dev. If there's an SQL Injection I'd say all bets are off re: schema.

That said I've definitely worked on applications where knowing the schema could help you exfiltrate data in the absence of a full injection. The most obvious being a query that's constructed based on URL parameters, where the parameters aren't whitelisted.

So I actually do agree that the schema could potentially be of marginal benefit to the attacker.
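A sketch of the allowlisting that the URL-parameter case above is missing, again with Python's built-in `sqlite3`. The `users` table, the `SORTABLE` set, and the `list_users` function are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (name TEXT, updated_at TEXT, password_hash TEXT);
    INSERT INTO users VALUES ('alice', '2024-01-02', 'h1'), ('bob', '2024-01-01', 'h2');
""")

# Only these columns may be used for sorting; everything else is rejected.
SORTABLE = {"name", "updated_at"}
DIRECTIONS = {"asc", "desc"}

def list_users(sort="name", direction="asc"):
    if sort not in SORTABLE or direction not in DIRECTIONS:
        raise ValueError(f"unsupported sort: {sort!r} {direction!r}")
    # Identifiers cannot be bound as parameters, so they are interpolated
    # only after passing the allowlist check above.
    sql = f"SELECT name, updated_at FROM users ORDER BY {sort} {direction}"
    return conn.execute(sql).fetchall()

print(list_users("updated_at", "desc"))
```

With the check in place, a request for `sort=password_hash` fails the same way whether or not that column exists, so knowing the schema gives no extra leverage on this endpoint.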

butlike · 1y ago

Wouldn't admitting this in court pin you with some sort of negligence? (if you knew having a schema revealed would compromise your app in some way).

pockmarked19 · 1y ago

Reminds me that the recently discovered “leak emails using YouTube” exploit kicked off from reading what is essentially, a schema.

https://brutecat.com/articles/leaking-youtube-emails

robocat · 1y ago

> kicked off from reading what is essentially, a schema.

I wouldn't call json a schema.

In the HN discussion tptacek replied that "$10,000 feels extraordinarily high for a server-side web bug": https://news.ycombinator.com/item?id=43025038

However his comment assumes monetisation is selling the bug; (tptacek deeply understands the market for bugs). However I would have thought monetisation could be by scanning as many YouTube users as possible for their email addresses: and then selling that limited database to a threat actor. You'd start the scan with estimated high value anonymous users. Only Google can guess how many emails would have been captured before some telemetry kicked off a successful security audit. The value of that list could possibly well exceed $10000. Kinda depends on who is doxxed and who wants to pay for the dox.

It's hard to know what the reputational cost to Google would be for doxxing popular anonymous accounts. I'm guessing video is not so often anonymous so influencers are generally not unknown?

I'm guessing trying to blackmail Google wouldn't work (once you show Google an account that is doxxed, they would look at telemetry logs or perhaps increase telemetry). I wonder if you could introduce enough noise and time delay to avoid Google reverse-engineering the vulnerability? Or how long before a security audit of code would find the vulnerability?

Certainly I can see some governments paying good money to dox anonymous videos that those governments dislike. The Saudis have money! You could likely get different government security departments to bid against each other... Thousands seems doable per dox? The value would likely decrease as you dox more.

tptacek · 1y ago

If you specifically log failed database queries, where "failure" means "indicative of SQL injection", then nothing you can do with the schema is going to reduce the signal in that feed --- even a single SQL syntax error would be worth following up on. No, I don't think your logic holds.

kmoser · 1y ago

I don't understand your logic. Knowledge of the schema can give an attacker an edge because they now know the exact column names to probe. Whether these probes get logged is irrelevant; even if it makes the system more vulnerable for an instant, it's still more vulnerable.

Even if logging failed queries is your metric, then knowledge of column names would make it more likely for an attacker to craft correct queries, which would not get logged, thus making your logs less useful than if the attacker had to guess at column names and, in so doing, incur failed queries.

lucb1e · 1y ago

> nothing you can do with the schema is going to reduce the signal in that feed --- even a single SQL syntax error would be worth following up on

Syntax errors coming from your web application mean there is a page somewhere with a bugged feature, or perhaps the whole page is broken. Of course that's worth following up on?

Edit: maybe I should add a concrete example. I semi-regularly look at the apache error logs for some of my hobby projects (mainly I check when I'm working on it anyway and notice another preexisting bug). I've found broken pages based on that and either fixed them or at least silenced the issue if it was an outdated script or page anyway. Professionals might handle this more professionally, or less because it's about money and not just making good software, idk

wglb · 1y ago

> "query failed" or "query succeeded, here's the data"

Blind SQL injection is a type where no error is produced, but some subtle signal can indicate success or failure. The most interesting one that I know about is where the presence of a successful injection was a normal looking response that was one byte longer than an unsuccessful injection. This was used to not only figure out the schema, but to fully exfiltrate the entire database.

There is nothing in the log on the server that indicates an error.

Most of the relatively introductory SQL injection exercises that I taught proceed without any knowledge of the schema.

This is why SQL injection is so insidious.
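A toy version of the boolean-based blind injection described above, run against an in-memory SQLite database with Python's built-in `sqlite3`. Everything here (tables, the `plate_exists` lookup, the secret) is made up. For brevity the sketch already uses the table and column names; the same yes/no oracle could instead be pointed at the database's catalog to recover those names first, which is the commenter's point.

```python
import sqlite3
import string

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (plate TEXT);
    CREATE TABLE logins (username TEXT, secret TEXT);
    INSERT INTO tickets VALUES ('ABC123');
    INSERT INTO logins VALUES ('admin', 's3cret');
""")

def plate_exists(plate):
    # Vulnerable lookup: the caller only ever learns True or False.
    sql = f"SELECT 1 FROM tickets WHERE plate = '{plate}'"
    return conn.execute(sql).fetchone() is not None

def extract_secret(max_len=32):
    """Recover logins.secret one character at a time through the boolean oracle."""
    alphabet = string.ascii_letters + string.digits
    recovered = ""
    for position in range(1, max_len + 1):
        for ch in alphabet:
            payload = (
                f"ABC123' AND (SELECT substr(secret, {position}, 1) "
                f"FROM logins WHERE username = 'admin') = '{ch}' --"
            )
            if plate_exists(payload):
                recovered += ch
                break
        else:
            break  # no character matched: end of the value
    return recovered

print(extract_secret())
```

Every query sent here is syntactically valid and none raises an error, so a log of failed queries would stay empty while the value is read out one bit of signal at a time.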

berkes · 1y ago

Not just with SQLi, but I've managed to statistically prove "information" with timing attacks.

Where if you join another table (by e.g. requesting extra info in a graphql query) the response goes from ms to s or even m. Indicating the size of the joined table.

Or where I could change a "?sort[updated_at]=desc" to a "?sort[password_hash]" through trial-and-error and suddenly see the response time drop from ms to seconds (in this case finding columns that exist but aren't indexed).

Even if the response content is exactly the same, we know things exist, are big, not indexed, or simply present, by timing the attack.

A famous one is obviously the timing trick to find out that an email is in the system because "user = user.find(email) && user.password_matches(password)" short-circuits if the email does not exist but spends significant time on hashing the password for matching it. A big lot of backends and apps make this mistake.
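One common way to avoid that short-circuit, sketched with Python's standard library only. The `USERS` store, the dummy record, and the `login` function are hypothetical, and PBKDF2 is a stand-in for whatever password hash a real backend uses.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000

def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

# Hypothetical user store: email -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"alice@example.com": (_salt, hash_password("correct horse", _salt))}

# Placeholder record so unknown emails cost the same hashing work as known ones.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("placeholder", _DUMMY_SALT)

def login(email: str, password: str) -> bool:
    record = USERS.get(email)
    salt, stored = record if record is not None else (_DUMMY_SALT, _DUMMY_HASH)
    candidate = hash_password(password, salt)         # always hash, user or not
    matches = hmac.compare_digest(candidate, stored)  # constant-time comparison
    return matches and record is not None

print(login("alice@example.com", "correct horse"))
print(login("nobody@example.com", "correct horse"))
```

Both calls do one full hash and one comparison, so the response time no longer depends on whether the email exists. The final `record is not None` check keeps the dummy record from ever authenticating anyone.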

gerdesj · 1y ago

That's where the court's technical distinction between the words: "could" and "would", is important. It appears they have reduced the distinction to a risk assessment which is more objective than opining wildly!

For example: I've just re-wired a three gang light switch. I verified power on with my multimeter (test the meter), cut the power and then retested all the circuits to make sure I had got it right.

It turns out that switch three is on a separate ring main. Cool I didn't get to test my body's ability to take a whopper of a shock. In the UK it is common to have upstairs and downstairs rings for light circuits. Our kitchen has quite a few lights in it so it got a separate ring as well. Anyway there are quite a lot of wires in there because all of them are two way switches. Oh and I am allowed to work on them because of the switch location - not kitchen and not bathroom, ie a low risk location

I noted down the connections, and took them all out. I put Wagos over the flying ends to make them safe, turned the power back on and got on with the job in hand.

I then cut the power (both circuits) checked again with my Fluke. Oh bollocks ... enable power, test the Fluke and then cut power again and recheck the circuits.

Now I re-terminated all the connections. There was plenty of additional wire so I decided to cut and re-strip the conductors, to make sure that I avoided potential failures due to "work hardening" from the inevitable pushing and pulling and "gentle" forcing into position. Once all the conductors were screwed down I pulled on them fairly forcefully to make sure they won't fall out.

I screwed down the switch face plate and restored power. It's a brushed metal finish switch so I did test it was not live, because I'm careful. I tested the functionality ie all three switch circuits (three) from all the switches (six).

So, given that description is it possible that the connectors might fall out in the future and short on say, the metal back box. Of course it is possible. It could happen but would it happen?

You could postulate all sorts of scenarios. Perhaps I may be careful but I might be cack handed and forgetful and got something wrong anyway and a wire might still drop out. Now we are at the point of whataboutery! and that won't wash.

The would/could distinction is a powerful one and it is analogous to how we do risk assessments.

I'm certainly not saying you are wrong in your assessment but I think you are fiddling with details to conjure up a "could" and not a "would". I agree that knowing the schema would assist a hacking attempt but would it make a successful crack more likely - no I don't think so. It is a classic case of obscurity despite security but a rather more complicated one than putting the ssh daemon on port 2222.

Cripes - I need to get out more!

tptacek · 1y ago · 10 in thread

Kurt posted this to troll me. Just know my audience here was, mostly, non-technical people involved in politics in my local Chicagoland municipality.

Permit me a PSA about local politics: engaging in national politics is bleak and dispiriting, like being a gnat bouncing off the glass plate window of a skyscraper. Local politics is, by contrast, extremely responsive. I've gotten things done --- including a law passed --- in my spare time and at practically no expense (drastically unlike national politics).

An amazing thing about local politics, at least in a lot of places, is that they revolve around message boards. The boards won't be in places you want to be (in particular: a lot of them are Facebook Groups) and you just have to suck it up. But if you enjoy participating in a community like HN, you can participate in politics, too, and message-board your way towards making things happen.

skissane · 1y ago

> Local politics is, by contrast, extremely responsive. I've gotten things done --- including a law passed

You live in a country where local governments have the power to make laws… in a lot of other countries they don’t - or, to be more precise, their lawmaking power is extremely limited.

Actually, even in the US, that’s often true too - only local governments with “home rule” can enact laws on any topic (provided it doesn’t contradict state or federal law), those without it can only enact laws on specific topics authorised by the state legislature. Some states grant home rule to all counties and municipalities, others none, others to some but not others (e.g. in Texas a municipality can give itself home rule powers, with approval of its voters, but only once it reaches a population of 5000).

bobthepanda · 1y ago

Even state legislators are, by their nature, pretty much locally driven given the relatively small size of their constituencies and thus the margin of victory.

Voters significantly underestimate their power even up to the House level; AOC’s first campaign was very scrappy and resulted in a bartender unseating the chair of the House Democratic Caucus and likely successor to Nancy Pelosi, and that was the first campaign in which anyone bothered to primary him.

copypasterepeat · 1y ago

Would you care to elaborate which law you helped to pass?

Also, can you link to some good resources for someone who wants to get off the sidelines and get more involved in Chicago politics, whether the resources are on FB or elsewhere? I've previously tried Googling for some but with very limited success.

Thanks.

tptacek · 1y ago

We're the first municipality in Illinois to draft and adopt an instance of ACLU's CCOPS model legislation, which requires board approval at a recorded public board meeting before any agency (most especially our police force) can adopt any form of surveillance technology, given a broad (ACLU-supplied) definition of "surveillance". Previous to that, our police force could acquire arbitrary surveillance products so long as they kept under a discretionary budget threshold; they used that latitude to acquire a pilot deployment of Flock ALPR cameras, and CCOPS was a response to that.

My real goal is zoning.

In Chicago itself, I have less clarity, but am optimistic that somewhere on Facebook is a message board where the staff at your alderman's office reads posts, and the most politically engaged people in your neighborhood argue with each other. That's your starting point (and maybe your ending point). Just go, listen, and chime in with high-effort comments. If you're used to clearing the bar for HN comments, you're way past the threshold of coding like a super-thoughtful person in local politics.

hinkley · 1y ago

“Never doubt that a small group of thoughtful, committed citizens can change the world: indeed, it's the only thing that ever has.” - Margaret Mead

Y_Y · 1y ago

Like a hedge fund? Or are we including those committed to violence?

zahlman · 1y ago

>The boards won't be in places you want to be (in particular: a lot of them are Facebook Groups) and you just have to suck it up. But if you enjoy participating in a community like HN, you can participate in politics, too, and message-board your way towards making things happen.

How do you figure out where to go?

tptacek · 1y ago

The way you'd expect: I bumbled through a bunch of different Facebook Groups, starting with the one simply labeled for my neighborhood, and followed cross-posts. Eventually I found the two really important ones in my area (one is an organizing group for local progressives --- I live in a very blue muni --- and the other is the main high-signal political group for the area, in which all the village electeds participate).

AceJohnny2 · 1y ago

> (drastically unlike national politics)

Man, I remember your & Maciej's effort to get FIDO keys to the campaign staffers, and how depressing that was

chaps · 1y ago

Aaaaaaa! I need to finish my post! :(

pavon · 1y ago · 9 in thread

Great read. Frustrating that the court ruled that a schema was a file layout, since I don't think it is, but at the same time if it didn't fall under that exception, there is a strong argument that it would be considered "documentation pertaining to all logical ... design of computerized systems". A schema is, literally, the logical design of the database, and the database is a part of the computerized system. Once it was ruled that those examples are "per se" exempt it was a long shot to argue that schema wasn't covered by any of the examples.

gregw2 · 1y ago

I completely agree with you that (unlike/despite the Supreme Court ruling), database table/column schema design (and other system designs) should fall under the Illinois statute as "documentation pertaining to all logical and physical design of computerized systems". It's interesting that the law did pick up on that distinction between logical and physical design but none of the parties described in this article did. Logical/physical designs are not just about servers and integrations, they are also about data.

I'm not sure why that wasn't argued by the state and the state argued the database schema was a "file format". Per my reasoning, the state still would have won, but for different reasons.

I disagree with you slightly however and would say that the schema table/column names should be considered not logical but "physical design" while the business naming/meaning of tables would be a "logical design" (or conceptual design). See Wikipedia: https://en.wikipedia.org/wiki/Logical_schema

SQL injection is really about physical schema designs, not logical ones (I do get that every bit of information including business naming of tables/columns helps in an attack, but it does change the degree of threat and thus the balancing tests of the risk which are relevant per the definitions and case law described in the original article.)

So in terms of what the law /SHOULD/ be, the law should not include logical design as a security exception, only physical design. It /SHOULD/ be possible for citizens to do FOIA requests and get a logical understanding of all the database fields without giving them the SQL names that can accelerate SQL injection attacks. In that way citizens could ask for the data by a logical/business-named handle rather than a physical one.

And the state should create logical models or provide data dictionaries with business (not technical terms) on request as part of their FOIAable obligations to their citizens for the data they are maintaining.
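
To make the logical/physical split concrete, here is a minimal sketch of a data dictionary that lets a request name fields by business term while the physical names stay internal. Everything in it is hypothetical: the table `TKT_MSTR`, the columns `ISS_DT` and `FN_AMT`, and the helper names are invented for illustration and are not taken from CANVAS or any real system.

```python
import sqlite3

# Hypothetical data dictionary: business term -> (physical table, physical column).
# None of these names come from a real system.
DATA_DICTIONARY = {
    "ticket issue date": ("TKT_MSTR", "ISS_DT"),
    "fine amount": ("TKT_MSTR", "FN_AMT"),
}


def build_query(business_terms):
    """Translate business terms into a SELECT over the physical schema."""
    tables = {DATA_DICTIONARY[term][0] for term in business_terms}
    if len(tables) != 1:
        raise ValueError("this sketch only handles terms from a single table")
    table = tables.pop()
    # Alias each physical column back to its business term so the result
    # set never exposes the physical column names.
    columns = ", ".join(
        f'{DATA_DICTIONARY[term][1]} AS "{term}"' for term in business_terms
    )
    return f"SELECT {columns} FROM {table}"


def run_request(conn, business_terms):
    """Run the translated query and return (headers, rows)."""
    cursor = conn.execute(build_query(business_terms))
    headers = [column[0] for column in cursor.description]
    return headers, cursor.fetchall()
```

The requester only ever sees the keys of `DATA_DICTIONARY`; whoever runs the query holds the mapping.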

My 2 cents as someone designing database schemas for 25+ years.

hot_gril1y ago

Schema is definitely software, an operating protocol, source code, and file layout. Maybe also documentation.

tptacek1y ago

A schema isn't software in the sense imagined by the ILGA. If it was, every Excel spreadsheet would be too, and Excel spreadsheets are the basic currency of FOIA.

An "operating protocol" is a step-by-step list of things to accomplish some action. It's a finite state machine for humans. Obviously, a schema isn't that; a schema is declarative, and an operating protocol is imperative.

The court definitively established that SQL schemas aren't source code in the sense imagined by the ILGA. SQL queries can be. Schemas are not.

See downthread for why a schema isn't a file format. In fact, a schema is almost the opposite of a file format.

A court will look at the term "documentation" in the ordinary sense of the word; as in, "a prose description and set of instructions".

"Associated with automated data processing operations" isn't an element in the statute; it's a description of all of the elements.

pavon1y ago

I think a schema will definitely be part of the source listing, either in the main programming language source code or in some other file used to define or initialize the database. But I don't think it is software, any more than a protocol is software. Software does something.

One tricky aspect of this is that even if the schema itself as a higher level concept doesn't fit into any of those definitions, all existing instances of the schema are likely considered either source listings or documentation. So the instances are barred from release per se, and you can't ask the government to create new documents.

paulddraper1y ago

How is a database schema not a file layout?

kasey_junk1y ago

The article describes why. 2 different db engines (or even instances) can use different file layouts for the same schema.

In many ways SQL is all about divorcing the schema from the files.
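
A small sketch of that point, using SQLite only because it ships with Python (the system in the lawsuit is a different database entirely). The same `CREATE TABLE` statement is stored in two database files that are built with different page sizes, so the schema text matches while the on-disk files do not. The table and column names are made up.

```python
import os
import sqlite3
import tempfile

# Hypothetical schema, identical for both database files.
SCHEMA = "CREATE TABLE tickets (id INTEGER PRIMARY KEY, issued_at TEXT, amount INTEGER)"


def create_db(path, page_size):
    """Create a database with the given page size; return (stored schema, file size)."""
    conn = sqlite3.connect(path)
    # page_size only takes effect if it is set before the file is first written.
    conn.execute(f"PRAGMA page_size = {page_size}")
    conn.execute(SCHEMA)
    conn.commit()
    # Read the schema back from SQLite's own catalog table.
    stored = conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'tickets'"
    ).fetchone()[0]
    conn.close()
    return stored, os.path.getsize(path)


def compare_layouts():
    """Build two files from one schema with different physical page sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        small = create_db(os.path.join(tmp, "small.db"), 512)
        large = create_db(os.path.join(tmp, "large.db"), 8192)
    return small, large
```

Both calls report the same schema string, while the file sizes differ because the physical layout is the engine's choice, not the schema's.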

hyperpape1y ago

It literally does not describe a file, and does not literally describe the data layout of anything on disk (though with enough knowledge, you may be able to infer facts about probable layouts).

dools1y ago

The schema describes the database layout. The file layout (if you were going to call it that) in a modern RDBMS would describe how the RDBMS implemented a particular database layout as described by the schema.

michaelmrose1y ago

Because it doesn't describe how data is laid out on disk.

Y_Y1y ago· 7 in thread

Is it not absurd that the supreme and appeal courts disagreed on a syntactical matter? Never mind that this isn't uncommon, or that (IMHO) it would be ridiculous to interpret it as "any file layouts at all, and other stuff too, but only bad other stuff". It's crazy to me that we're happy for laws to sit on the books being utterly ambiguous.

I know this suits the courts who benefit from the leeway, and that (despite valiant efforts) we're not going to get "formal formal" language into statutes. I know that the law is an ass. I know that the laws are written by fallible and naive humans.

Even after all that, if the basic sentence structure of what's in the law isn't clear to the courts, hasn't the whole system fallen at the first hurdle?

copypasterepeat1y ago

I am not a lawyer, but my understanding is that's just how the justice system works. Reasonable people can disagree about what exactly a complicated statement says, since language is full of ambiguities. People have been discussing what the U.S. Constitution says exactly from the day it was written and there are still a lot of disagreements.

The standard response to this is that laws should be written in ways that are non-ambiguous but that's easier said than done. Not to mention that sometimes the lawmakers can't fully agree themselves so they leave some statements intentionally ambiguous so that they can be interpreted by the courts.

kmoser1y ago

Nobody reasonably expects all laws to be written completely unambiguously. But since laws (and indeed all manner of legal documents) are filled with lists and modifiers, I don't think it's unreasonable to require that they be written to a certain standard which defines how these lists and modifiers should be interpreted, similar to RFC 2119 https://microformats.org/wiki/rfc-2119.

skissane1y ago

I’ve often thought we’d get more sensible results in court cases on computer-related issues if we had specialised courts where the judges were required to have a relevant degree (computer science, software engineering, computer engineering, information systems, etc). But I doubt it is going to happen any time soon.

Xelynega1y ago

Correction, that is how a common law legal system works.

Alternatives like codified law exist and are practiced, just not in the US or Canada.

tptacek1y ago

To me it feels like the kind of dispute that is exactly why we have multiple levels of appeals court. The "file format" thing is super dumb, and they got it wrong, but the "that if disclosed" statutory interpretation is a thing that seems important to get a final, consistent determination on.

Y_Y1y ago

Of course I can't disagree that it's good that it's now settled. Still I can't help but imagine a world where the meaning, at least in terms of which words apply to which others (rather than qualifiers like "reasonable"), should be settled before the law is debated, voted on, and passed.

Even (some) programmers have learnt the dangers of parsing at run time (e.g. "eval is evil"). How can we decide it's the law we want if we don't know what it means yet?

olau1y ago

I find it slightly odd that you get hung up on the file format thing. The law as you quoted it says "including but not limited to" and the first example given is then "software".

EMIRELADERO1y ago· 5 in thread

Am I the only one slightly perplexed/worried by the point-blank source code exemption?

It's easy to imagine a scenario where the city decides to develop a specific software in-house and hide the "biases" in the source code, or any other thing one might not find desirable.

Hell, they don't even need to make everything from scratch! Could just patch and use a permissively licensed 3rd-party component.

In my opinion, the proposed amendment does not go far enough.

manquer1y ago

It shouldn't be surprising ?

It is the same problem people trying to open source closed projects experience: there is all sorts of locked-in proprietary code which the developer and the customer only have the license to use but not share the source.

Even projects which from day one are staunchly open and built without the kind of direct commercial interests government contractors have also suffer from this. The Linux kernel's challenges in supporting ZFS or binary blob drivers in kernel/user space and so on are well known[1]

Paradoxically, on one hand information wants to be free, and economics dictate that open source software will crowd out closed competitors over time; on the other, it is expensive to open source a project, sometimes prohibitively so, and that deters many managers and companies from open sourcing their older tools etc., even if they would like to do so. Involving legal and trying to find even the rights holder for each component can deter most managers.

If a government put requirements in contracts that the vendor should only use open source components in their entire dependency tree, it could drive the costs very high because a lot of those dependencies may not have equivalent open source ones or those lack features of the closed ones so would need budgets to flesh them out. In the short term no legislature will accept that kind of additional expense, while in the long term the public will benefit.

---

[1] Yes, the kernel problems are largely a function of the GPL; more permissive licenses like Apache 2/MIT would not have them. BSD variants, after all, had no challenges in supporting ZFS.

However a principled stance on public applications being open source by government would be closer to GPL than MIT in terms of licensing. Otherwise a vendor can just import the actual important parts as binary blobs "vendored" code and have some meaningless scaffolding in the open source component to comply.

Y_Y1y ago

Maybe FOIA should trump licensing in this case. Suppose I write a manual on how to issue bad parking tickets and hide them in a database, and then license that (in some restrictive manner) to the state of Illinois. I think the public's right to see that document is more important than my right to prevent copying and dissemination.

contravariant1y ago

In theory the decision to put those biases in the code should be public information. You can ask for the criteria the software was made to, just not the software itself.

Though rulings like this might have a chilling effect.

qingcharles1y ago

Only if they are written down. For instance, DOGE makes sure everything is done by voice so there is nothing to catch them out on in future. I've found that once you start hitting a public body with FOIAs regularly they learn to stop putting incriminating things down in writing.

dotdi1y ago

That's why it's important to push for "public money - open source" initiatives like some countries in the EU are trying to implement.

Off the top of my head, I think the last (now failed) German coalition had this in their programme but didn't deliver. Maybe the new government will.

duxup1y ago· 5 in thread

Very interesting read.

It does seem absurd to think of divulging schema as protected, as described it allows for a magical sort of outcome where: "well it's in a database you can't know anything about, and if you can't tell me how to find it you're sol".

Working at a small company with lots of clients I wouldn't want to hand out DB schema outright, but I also go out of my way to search / get the client the data they want ... not reject them.

rectang1y ago

A private company wouldn't want to divulge their DB schemas because it's advantageous for competitors to see how you're doing things. That doesn't apply to government databases.

chaps1y ago

Not quite, and the details get hairier the closer you look. The database in question here is an IBM system. The database itself is used for government functions, making it FOIA'able, despite it being managed by a third party company. IBM even tried to argue that the schema was trade secret, but the statute isn't straightforward. Here's my (successful) response when they tried:

You mentioned on Thursday over the phone that IBM is not too keen on having its database schema released, and, between IBM and Chicago, is seeking an exemption under 5 ILCS 140/7(1)(g) - an exemption that is only valid if the release of records would cause competitive harm. This email preemptively seeks to address that exemption within the context of this request in the hopes of a speedier release of records. It is FOI's belief that there is little room for the case for the valid use of 5 ILCS 140/7(1)(g) when considering the insignificance of the records in conjunction with the release of past documents:

1. Chicago released CANVAS's technical specification [1] seven years ago. To the extent that the specification's continued publication does not cause competitive harm, it is very unlikely that the release of CANVAS's database schema would cause any harm.

2. The claim that the release of a database schema would cause competitive harm is not unlike suggesting that the release of filing cabinets' labels can cause competitive harm.

Furthermore, in your response, please be mindful that the burden of proving competitive harm rests on the public body [2].

[1] https://www.cityofchicago.org/content/dam/city/depts/dps/Con...

[2] http://foia.ilattorneygeneral.net/pdf/opinions/2018/18-004.p...

bob10291y ago

The schema on the last project I worked on was probably our most important IP. Specifically, the ways in which we solved certain circular dependency issues.

I wouldn't take the ability to design a schema for granted. I don't think many people are any good at it. Do not underestimate the value of your work products.

hinkley1y ago

Part of the reason I’m so… enthusiastic… about tech debt is that I’ve worked a few times where we had a competitor whose lunch we were stealing or who was stealing ours and the ability or inability to copy features cheaply was substantially the difference between us.

That quad graph of value versus difficulty that everyone loves? It’s not quadrants it’s a gradient and the difficulty dimension depends quite a bit on context. What’s a 4 difficulty for me might be a 6 for someone else. Accidental versus intrinsic complexity plus similarity to or distinctions from things we have already done.

bornfreddy1y ago

Maybe. But now I'm really curious how bad that schema must be for them to hide it so viciously.

aqueueaqueue1y ago· 5 in thread

Interesting takeaways from me:

All that pompous sounding legalese can still be ambiguous! I feel less bad for not understanding contracts that have 100 word compound sentences.

Legal people can't keep up with our tech jargon but they have their own jargon including "predicate" lol. So same logical thinking, different jargon framework.

Question: why do they want the schema not the data?

tptacek1y ago

Because once you have the schema you can issue FOIA requests that include queries for them to run.

darkarmani1y ago

Is the schema considered private information or just information not required to be released via FOIA? ie: Can't some nice employee leak this information or is it legally protected?

Once the information is released, can anyone make FOIA requests using the schema?

hot_gril1y ago

What if you guess common table names? Wonder if they send back the error message.

aerzen1y ago

Could you ask them to run an introspection query? Something like SELECT * FROM information_schema.tables?

aqueueaqueue1y ago

Oh wow! If that is necessary, that is so kafkaesque!

"I want your data"

"What data?"

"What do you have?"

"Ha ha. No. Tell me what you want"

"Your data that is the metadata of your data"

"Well actually..."

...

Terr_1y ago· 4 in thread

> Each spreadsheet has a header row, labeling the columns, like “price” and “quantity” and “name”. A database schema is simply the names of all the tabs, and each of those header rows.

This is also how I explain it to my relatives, I'm kind of surprised this analogy (one so direct that it's almost literal) didn't fly with the judges.

If database column names cannot be revealed, then shouldn't that mean the state is also able to redact the headers of all their spreadsheets?

kmoser1y ago

Knowing a spreadsheet header doesn't help an attacker gain access to that spreadsheet in any way. Knowing SQL column names may give an attacker an advantage in accessing a database.

Terr_1y ago

Compare: "Knowing the writing style of current employees may give an attacker an advantage while phishing, therefore, we cannot turn over any memos or emails whatsoever."

Ditto for the org-chart.

flutas1y ago

Per the post, this also wouldn't fly.

> Believe it or not, there’s case law on “would” versus “could” with respect to safety. “Could” means you could imagine something happening. But the legal standard for “would” is “clear evidence of harm leaving no reasonable doubt to the judge”. The statute set the bar for me very low and I managed to clear it.

butlike1y ago

It's a reverse vlookup

DangitBobby1y ago· 4 in thread

When a law is ambiguous by wording, why do they never ask the people who drafted the law what was intended?

jaza1y ago

That would be against the separation of powers doctrine inherent in all Western democracies. The job of the legislature is to write the law. The job of the judiciary is to interpret the law.

Besides, when the law is ambiguous, it's very often because the legislature themselves weren't sure what they intended, and/or because the legislature had deeply divided views and arrived at ambiguous wording as a compromise, and/or because the legislature used their "somebody else's problem" prerogative i.e. they said "let's leave that for the courts to decide". Ambiguously worded laws aren't a bug, they're a feature!

DangitBobby1y ago

I don't see how it could break separation of powers, especially if a legislator could provide minutes and/or a paper trail of discussions and revisions pointing the intent in a certain direction. You know, like evidence. The legislature surely has intent while writing the law, otherwise what would be the point in trying to interpret it, and the whole thing being litigated is the authors' intent. I don't think the separation of powers doctrine presupposes that the legislature has no idea what their goals are while writing laws, that would be quite an insane assumption to bake into our system, and broken by design. And in this case, I very much doubt it was left intentionally ambiguous, since FOIA was clearly intended to help people get information from obstinate government agencies. What would even be the point in writing the law if obstinate government agencies are supposed to be able to weasel around the ambiguity behind a comma? Regardless, if we are able to ask the people who spent time drafting it, we could ask. There might even be a paper trail!

tptacek1y ago

The current sitting ILGA is not the ILGA that passed the statute.

DangitBobby1y ago

They are probably still alive, shouldn't be that hard to find. They have no problem giving subpoenas to other witnesses or soliciting expert testimony.

wswope1y ago· 4 in thread

Anyone with a legal background willing to opine about potential workarounds to this ruling?

Specifically, would a request for “data field labels” (i.e. a column list without any table structure info) likely circumvent the exemption?

gpm1y ago

I think that would run afoul of

> The one big limitation of Illinois FOIA (with FOIA laws everywhere, really) is that you can’t use them to compel public bodies to create new records.

Unless for some reason they already had a list of columns without table structure.

(Not that I claim to have a legal background)

wswope1y ago

I had that thought too, but my naive rebuttal would be that the column data already exists by default in any standard RDBMS as information_schema.columns. No new record creation required.
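
As a rough illustration of "already exists in the catalog", here is a sketch against SQLite, which has no `information_schema` but keeps the equivalent metadata in `sqlite_master` and `PRAGMA table_info`. The helper names are mine, and this says nothing about how the city's actual database is laid out.

```python
import sqlite3


def list_columns(conn):
    """Return (table, column, declared_type) tuples read from SQLite's own catalog."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    result = []
    for table in tables:
        quoted = table.replace('"', '""')
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, *_rest in conn.execute(
            f'PRAGMA table_info("{quoted}")'
        ):
            result.append((table, name, col_type))
    return result


def field_labels(conn):
    """Column names only, with the table structure dropped."""
    return sorted({name for _table, name, _type in list_columns(conn)})
```

`list_columns` is the full table-and-column view; `field_labels` is the flattened "data field labels" list asked about upthread. Whether producing the flattened list counts as creating a new record is the legal question, not a technical one.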

duxup1y ago

Yes but what if we come up with a directive that every FOIA request must be logged into a DB. Therefore every request is automatically invalid as it requires we create a record!

Andys1y ago

Not a lawyer, but why not use open source as an example? Many successful public e-commerce websites have public schemas and aren't all hacked.

ajkjk1y ago· 4 in thread

This was fine, legally, but I'd be pretty irritated if someone I knew wasted everyone's time on this. The schema clearly is (marginally) useful for hacking, but who cares; it clearly is a file layout also, but who cares; those matter legally but not morally. Morally, this is just dumb: it's not something they really needed, and they're just irritating people and wasting resources for the fun of it. Shameful.

tptacek1y ago

No. I'm involved in local government, and on the citizens commission where we keep track of how our municipality (adjacent to Chicago) stores and manages information. I'm acutely familiar with how people are spending their time in these organizations, and what is and isn't a big lift for them.

Increasingly, year over year, more and more information that would previously have been stored in filing cabinets or shared drives is moving into turnkey applications that municipalities buy and enroll all their data in. Those applications are opaque. But almost all of them are front-ends to SQL databases.

Being able to recover schemas from publicly operated databases is vital to keeping public records and data public, rather than de-facto hidden from inquiry.

Matt's suit was anything but a waste of people's time. Hopefully, it'll result in a change to our state law.

hot_gril1y ago

Just because the article gets into fine details doesn't mean it's silly. They're working with what they have.

But after reading more, I agree. The point of FOIA in the first place was "access by all persons to public records promotes the transparency and accountability of public bodies at all levels of government." Not "pushing FOIA statutes to their limits, sniffing out buried data and bulk-extracting it with clever requests."

If he's just asking for his own parking ticket records, ok. This isn't in the spirit of that. Separately, I agree that the SQL schema is software, a type of file layout, marginal attacker benefit, and other things in that exemption, and I'd say that again as an expert witness.

zonkerdonker1y ago

See here: https://news.ycombinator.com/item?id=43176625

FOIA requester responded in comments saying they received a tip indicating illegal practices, and noted in his article that he had previously uncovered evidence of over-policing in black neighborhoods.

jbritton1y ago

I think a file layout describes the exact arrangement of bytes in a file. A schema is higher level. It describes what is stored, not how it is stored. A database could be one file, or a file per table, or a file per column. Data could be stored across multiple drives.

bobsmooth1y ago· 3 in thread

What stands out to me about this article is the time between court appearances. Seems like if you want to accomplish anything in court you need to be prepared to spend years of your life on it.

rectang1y ago

And of course, people and entities (private or as in this case public) who have a lot of resources take advantage of that, a state of affairs which often serves to perpetuate injustice indefinitely.

lucb1e1y ago

Can confirm this is the case everywhere. Even before taking anything to trial, one can spend months on trying to come up with a mutually agreeable solution, in my case getting seemingly one step further each time¹. I'm not sure I'd not just give up and move on with my life if this dragged on for years and wasn't about something that majorly impacts my life or that of a loved one

¹ Details: it was a warranty case, so first they agreed to repair it, then they didn't do that (but maintained that they were going to, whenever I asked about the status), then they agreed to refund, then they didn't do that, then I set a deadline, they iirc agreed, then they didn't pay, then I included specifics of what my next steps would be (lots of research here, seeing what even my options are and what I can truthfully claim that won't get shot down by a judge later) if they didn't pay before some other deadline (so I showed I was serious now), then the deadline crept up and they finally refunded the day before it would expire and I was frankly disappointed because, by now, I was prepared and ready, and all I got was the original sum that I had paid them. I checked the legal interest rate and changing my demand to include that simply wasn't worth wasting more time on this, and I didn't find any sort of precedent that I could bill any time I provably spent, not even to the value of minimum wage, so any time you invest is just lost free time (which I didn't have much of during that particular year). Protip: scroll down the reviews before buying something worth more than a few tenners from a small store. I wasn't the first person who had to threaten litigation...

barbazoo1y ago

I thought the same thing. Sure it's async but still you have to keep this in your mind for a very long time.

jaxgeller1y ago· 2 in thread

I FOIA'ed >1M pages of docs for my project cleartap.com, a DB of water quality of the USA.

Most states would charge a small amount to gather the documents.

Michigan wanted $50K for the FOIA request. I think because of the Flint lead crisis. They wanted me to go away.

davethedevguy1y ago

I noticed that you do have data for Flint. Did you have to pay it, or is there some appeals process if you're quoted an unreasonable amount?

Great project by the way!

jaxgeller1y ago

Ended up finding the majority of Michigan through scraping.

For example, https://www.cityofflint.com/wp-content/uploads/2023/06/Annua...

inetknght1y ago· 2 in thread

> You also generally can't FOIA the source code of programs they run.

Alas, that part should be illegal under FOIA.

Source code should be open source and verifiable. Being exempt from FOIA circumvents public confidence in the government's use of software.

I'd be curious to learn if/where courts have decided such things already.

jaza1y ago

I assume that - even though there's a strong public interest argument for it - government orgs are prone to blanket banning the release of source code, for the same primary reason that businesses are prone to doing so. That is, too high a chance of sensitive data (passwords, tokens, IP addresses, etc) being hard-coded in all-too-often non-12-factor-aspiring code; and too much security / liability headache if said sensitive data gets out.

There's probably also some actual business logic that government orgs want to and are legally permitted to keep secret. In the OP's case of a parking ticket database, maybe there's software talking to that database, whose source code includes the logic of picking when / where parking inspectors should conduct a "random" blitz of issuing fines.

inetknght1y ago

> maybe there's software talking to that database, whose source code includes the logic of picking when / where parking inspectors should conduct a "random" blitz of issuing fines.

Oh yes, and that "random" blitz of issuing fines definitely doesn't have any racist part to its algorithm. Just trust the government on that one. The government and the "business" what wrote the code in the first place. Yup, makes sense.

Jean-Papoulos1y ago· 2 in thread

I understand freedom of information, but what exactly does the public gain by Matt getting the database schema ?

If the answer is "the ability to request data from a specific table/column", I would say that this should be possible to do by asking for the relevant data directly (instead of asking for "the timestamps of each ticket" ask for the "time-related data of each ticket" for example) ?

And yes, having your db schema out in the wild can be a vector of attack, if only because it allows targeting the sql injections (the blog author himself argues this in court).

The court was right to reject this. Maybe the exact word of the law doesn't ask for it, but the spirit certainly does.

gizmo1y ago

Municipalities obstinately refuse reasonable requests because they resent that the Freedom of Information Act allows regular civilians to get all up in their business. The excuses they make for noncompliance (it's burdensome! it violates privacy! sql injection!) are not serious. They don't want to comply because they don't like accountability. That's it.

tptacek1y ago

The blog author argued no such thing, because that is not true.

lcnPylGDnU4H9OF1y ago· 2 in thread

> where the only way to get at the underlying data is to FOIA a database query

Was this ever attempted?

  SELECT * FROM `information_schema`.`tables`;

chaps1y ago

Yep, that was done in the FOIA request related to this lawsuit:

  select utc.column_name as colname, uo.object_name as tablename, utc.data_type as type
  from user_objects uo
  join user_tab_columns utc on uo.object_name = utc.table_name
  where uo.object_type = 'TABLE'

https://www.muckrock.com/foi/chicago-169/canvas-database-sch...

lcnPylGDnU4H9OF1y ago

Yeah, it's obvious the double standard here, then. Curious indeed why they are so adamant to keep the schema/data secret.

neilv1y ago· 2 in thread

> [...] where the only way to get at the underlying data is to FOIA a database query.

Can you request the desired information using natural language, based on your guesses of what information they store?

tptacek1y ago

Probably not, because then you'd be asking them to go do research. You FOIA for specific documents and records.

neilv1y ago

So you can ask for the document that is the inspection report from Mel's Diner on date 11/11/2024?

Can you ask for the database record from dispatching that inspection visit to Mel's Diner on 11/11/2024, even if you don't know the exact database column names and relations?

If you can ask for that one dispatch database record, without knowing the schema, can you ask for the database records for all inspection visits to all locations in Smallville in 2024? (Or does the complexity of that database query constitute "research"?)

gowld1y ago· 1 in thread

This is part of what discouraged me from going to law school. So much of litigation is Kabuki theater, grand rhetoric not in any way intended at achieving a just or logical outcome, but designed only to give the person in power an excuse to decide however they had already wanted to decide before the case was tried.

lucb1e1y ago

> So much of litigation is Kabuki theater, grand rhetoric not in any way intended at achieving a just or logical outcome

Agreed, that is what this sounds like. What stood out to me is the remark »“only marginal value” is just self-important message-board hedging«: it's also simply correct, but the author concluded that they shouldn't have said it because "marginal" plus a bunch of explanation didn't have the rhetorical value that "no" would have had

Someone could legitimately configure a WAF-like system to scan for various ways of querying the database schema coming in as HTTP requests (keywords like "information_schema", encodings thereof, etc.), which will always be hacking attempts and can be blocked. If you already have the schema, you can craft a query without needing to bypass that restriction first. Is this likely to be a serious barrier at all? No. Is it anything to do with self-importance? I don't see how that's the case, either. It seems simply correct that this is marginal (situated in the margins, not the point, not important to discuss), but by saying nothing but the truth, now the other side blows that up to something much bigger and tries to get the court to agree that, "see, their own expert says it has value!" And so this expert concludes that they shouldn't have said it, that they should have just said "no value" which I would say is wrong, but so marginally wrong that it's hard to prove for the opposing side that it is not fully correct, and thus being less correct helps you in (this) court... so it's about rhetoric as much as being an expert...
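
For what it's worth, the kind of filter I mean could look roughly like this toy sketch. The keyword list and the function name are invented for illustration, and a real WAF rule set does far more than substring matching.

```python
from urllib.parse import unquote_plus

# Hypothetical keyword list: catalog names an attacker might query to learn a schema.
SUSPICIOUS = ("information_schema", "user_tab_columns", "sqlite_master")


def looks_like_schema_probe(raw_query_string, max_decode_rounds=3):
    """Flag query strings that mention schema catalogs, even when URL-encoded."""
    text = raw_query_string
    # Decode repeatedly so double-encoded payloads such as %2569 are unwrapped.
    for _ in range(max_decode_rounds):
        decoded = unquote_plus(text)
        if decoded == text:
            break
        text = decoded
    lowered = text.lower()
    return any(keyword in lowered for keyword in SUSPICIOUS)
```

An attacker who already holds the schema never needs to send those keywords, which is the entire extent of the "marginal value".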

probably_wrong1y ago· 1 in thread

Random thought: someone should drive to Chicago, get a parking ticket, and then make a FOIA request for all of their information contained in that database.

It won't be the whole database schema, but it would be a start.

chaps1y ago

Short answer -- already been done.

This (spoiler) visualization's going into my eventual post about the lawsuit: https://observablehq.com/d/026992341cc47ff0

alexashka1y ago· 1 in thread

Wowzers, that was a lot of words to express something that's very simple.

A database schema is just an empty form. By looking at an empty form, you know what fields have to be filled in, what type of information they'll contain, etc.

Of course people making data requests need to know what forms are being used to collect and store information.

As for security - not letting people do anything because 'it might be dangerous' is bonkers. The way to secure databases has been known for decades. Let's start living in the 21st century :)

tptacek1y ago

The whole back half of the post is about why the analysis is not as simple as you suppose it is. We had no trouble establishing at Chancery Court that schemas don't endanger security. That's not why the case failed at the Illinois Supreme Court. The IL Supremes did not decide spontaneously that schemas actually are dangerous.

1 more reply

hnthrow903487651y ago· 1 in thread

>just self-important message-board hedging

I can confidently say it does not stop at message boards for many people, self included

tptacek1y ago

It's a real issue when writing an affidavit or testifying. Lots of ingrained bad habits.

koolba1y ago· 1 in thread

> [Public bodies] shall provide a sufficient description of the structures of all databases under the control of the public body to allow a requester to request the public body to perform specific database queries.

I sure hope the impact of this is not that government entities switch to schemaless databases!

CharlesW1y ago

"Schemaless" is like "serverless" in that there's always a schema, even if it's not enforced by the database and instead applied dynamically by the application layer.
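
As a rough illustration of that point, the sketch below recovers the implicit schema from a handful of JSON documents. The records and the helper name `infer_schema` are made up for the example; it assumes flat documents with no nesting.

```python
import json
from collections import defaultdict

def infer_schema(documents):
    """Map each field name to the sorted list of Python type names seen for it."""
    seen = defaultdict(set)
    for doc in documents:
        for field, value in doc.items():
            seen[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in seen.items()}

# Made-up records standing in for a "schemaless" collection.
raw = [
    '{"plate": "ABC123", "fine": 60}',
    '{"plate": "XYZ789", "fine": 75.5, "paid": true}',
]
schema = infer_schema(json.loads(line) for line in raw)
```

The store never declared `plate`, `fine`, or `paid`, but any application reading these records has to know about them, so the schema exists whether or not the database enforces it.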

rafram1y ago· 1 in thread

How were you able to stand as an expert witness when you have a personal relationship with the plaintiff? I don’t know the specifics of the law in Illinois, but my understanding is that that would generally be a disqualifying conflict of interest.

hondo771y ago

I have this cousin, Vinny, who's a lawyer, and he was able to use his girlfriend as an expert witness. Both sides agreed she really knows her stuff because that's what really matters.

scotty791y ago· 1 in thread

> Does the “would jeopardize” language in the statute apply to everything in the exemption, or just to the nearest noun “any other information”?

I think law and lawmaking would be vastly improved if only lawyers learned the miracle of parentheses.

Ylpertnodi1y ago

Comma's can be expensive, too.

gervwyk1y ago· 1 in thread

Should have used mongodb in the first place.

qbxk1y ago

lol'd so hard at this

dylan6041y ago

"Retrieve the data of every parking ticket issued to ‘Bob O’ and also all the rest of the information in the database including everyone’s passwords."

This is the example of SQL Injection written in plain English, yet "everyone's" is problematic here in that it's an orphaned single quote. If "Bob O'Conner" is bad, so is "everyone's"

kingforaday1y ago

Given the Illinois Supremes' decision, seems like an opportune time to say "Everything is a file".

1. https://en.m.wikipedia.org/wiki/Everything_is_a_file

boxed1y ago

> Unfortunately, the Illinois Supreme Court had at their disposal a second dictionary. In the Merriam-Webster Online Dictionary, a “schema” is defined as “a structured framework or plan: outline”. “This is a difference in name only”, said the court. Argh. Schemas are now file layouts. We lose.

This is really bad. Words have different meanings in different domains. You can't just point to a dictionary definition for the wrong domain. This is absolute madness and should be grounds for termination as a judge. Imagine how angry that judge would be if you did that for some random legal jargon that is very different from the common definition of a word!

rubymancer1y ago

It's Matt Chapman! https://mchap.io/

I helped him process and visualize the original batch of parking ticket data waaaay back in 2016.

I can't believe he's still on this in 2025. We need more junkyard dogs like him fighting for what's right.

djeastm1y ago

I suppose I need to change all my column names to random 16-character strings so I don't leave my database insecure!

indymike1y ago

There is no freedom of information if the public is not allowed to know what data the government has.

lq9AJ8yrfs1y ago

In the new language proposed in SB0226 (as linked, didn't search for authoritative sources, can't tell how durable that link will be for posterity, arrgh archiving the web is hard etc), doesn't that language leave open a hole for excessive complexity to be a reservoir for FOIA resistance?

Feels like there is an important theme here that SB0226 is dancing around -- could government be legible in addition to being "plain-text" transparent?

"plain-text description" of "each field of each database of the public body" and "specific database queries" may not do what you mean.

Not sure how to fix it though.

I could see gratuitous ORMs and database-of-databases patterns winning tax dollars with taunt-them-with-the-schema listed as a feature.

thayne1y ago

I'm confused why file layout is included in the list of exceptions in the first place. If an adversary knowing your file format is a security problem, then you are doing something very wrong!

And the ruling that the condition only applies to "other information" (which to me seems like a very strange reading, and probably not the intent of the law) creates a massive loophole, regardless of whether a SQL schema is considered a "file layout": the government can just use some obtuse custom file layout to avoid FOIA requests.

makach1y ago

Does disclosure of a database schema really jeopardize the security of the system? Yes

How plausible or likely does that jeopardy need to be? Very

Does a database schema constitute “source code”? Yes

Is a SQL schema a “file format”? No & yes. In that order.

And, finally, does the “would jeopardize” language apply to everything in the exemption, or just to the nearest noun “any other information”? Yes

irrational1y ago

> I’ll conclude this long piece by saying (1) obviously the bill should pass, and (2) it should be called “The Chapman Act”.

(3) I imagine Chicago greatly regrets towing Matt Chapman "over a facially bogus ticket".

gavin_gee1y ago

https://x.com/JackRhysider/status/1885732851779285184

b81y ago

Got to see this happen day by day on the Midwest Venture Partners Slack. There was another lawsuit Chapman and Tom did for laser-based speed detection in Chicago.

pudding123451y ago

Do stored procedures count as part of the schema? I've recently found a SQL injection vulnerability in a client's SP that was using concat (very badly)

ngriffiths1y ago

> Congratulations! You now understand databases.

Data engineering: doing a lot of fancy work to make a very simple product

el_snark1y ago

Enjoyed the read. Good luck with the future developments.

Now a nerdy question. As someone who investigates SQL injections, why are you running a server based on nginx 1.4.6? Do you know something I don't? :-)

abfan11271y ago

am I the only one disappointed there's no mention of little Bobby Tables?

gunian1y ago

sql injection court seems more fun than slave court where they tell you spending anything above 5 is a crime lmaooooo

lucb1e1y ago

I got to about 1/3rd of the way before I noticed my eyes were kinda struggling to read the article. Toggling different CSS rules, it's the #333 gray color. Turning that off is instantly better. The custom font is much thinner than the default, but that by itself doesn't seem to be the issue if the color is (closer to) black. (There is also a font-weight rule, but toggling it makes no visual difference in Firefox. Maybe the text is intended to look different?)

Since there is no contact method on the website, figured I'd mention it in a comment; hope this helps

1 more reply

lubujackson1y ago

Juxtapose this legal process with DOGE hoovering (in more ways than one) data willy-nilly from everywhere. The dissonance between THIS uninteresting DB schema being so rigorously protected while massive amounts of sensitive data is completely misappropriated is painful.


dataflow1y ago

tczMUFlmoNk1y ago

You can always `SELECT table_name, column_name, data_type FROM information_schema.columns`, which is part of the SQL standard. https://www.postgresql.org/docs/current/infoschema-columns.h...
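
For readers who want to try the idea locally, here is a small Python sketch. It uses SQLite purely as a stand-in, and SQLite has no `information_schema`, so the sketch reads `sqlite_master` and `PRAGMA table_info` instead. The tables, columns, and the helper name `list_columns` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, plate TEXT, fine REAL);
    CREATE TABLE officers (badge INTEGER, name TEXT);
""")

def list_columns(connection):
    """Return (table_name, column_name, data_type) tuples for every table."""
    rows = []
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        # The table name comes from the catalog itself, not from user input.
        for _cid, column, data_type, *_rest in connection.execute(
            f'PRAGMA table_info("{table}")'
        ):
            rows.append((table, column, data_type))
    return rows
```

The result has the same shape as the `information_schema.columns` query: names and types only, with no rows of actual data.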

6 more replies

chaps1y ago

The Department of Justice disagrees and voluntarily releases column and table names: https://www.justice.gov/afp/media/1186431/dl?inline=

gwd1y ago

> I don't understand the argument that knowing the column names doesn't help an attacker?

Would knowing the structure of Illinois governmental organizations help someone perform social engineering attacks against them? Yes, absolutely.

Should Illinois therefore keep the internal structures of their organizations -- the department names and the officials who run them -- secret? No, absolutely not.

AdamJacobMuller1y ago

> And I don't think I disagree with the court on schema vs. file layouts either.

3 more replies

dmurray1y ago

And this part seems self-defeating:

> Attackers like me use SQL injection attacks to recover SQL schemas. The schema is the product of an attack, not one of its predicates.

If it's the product of an attack, but not the end goal, surely it's of value to the attacker?

The proposed amendment sounds like a good way to fix this - is it likely that will pass?

2 more replies

econ1y ago

If you have an injection friendly application then that is the security problem.

Say someone hacks the db: is the problem easy-to-guess table names? The column should never have been called "passwords"?

Perhaps 30 years ago that would sound good.

Obscurity should hardly ever be a line of defense. If it is the only defense the problem isn't that it wasn't obscure enough.

Edit:

1 more reply

ic4l1y ago

I agree with you. Knowing the exact column names can speed up an attack and, in some cases, make it more feasible.

2 more replies

fsckboy1y ago

>It's not the file layout, but it's analogous...How do you argue they aren't analogous?

laws don't get to be analogous

foia request: "I'd like the report the committee prepared about the costs for the new bridge"

response: "denied. the report contains costs laid out in tables with headings, which while not being schemas are analogous, with schemas not being files but being analogous"

mcv1y ago

Yeah, I think it's still useful info for an attacker. But only if the system was actually developed by amateurs who never heard of parameterized queries.

I find it a bit bizarre that the city uses "our system was developed with no consideration for security" as a valid defense.
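
For anyone unfamiliar with the distinction, here is a minimal Python sketch contrasting string concatenation with a parameterized query. It uses an in-memory SQLite database; the `tickets` table, its contents, and both function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (owner TEXT, fine INTEGER)")
conn.execute("INSERT INTO tickets VALUES (?, ?)", ("Bob O'Conner", 60))
conn.execute("INSERT INTO tickets VALUES (?, ?)", ("Alice", 75))

def find_unsafe(name):
    # String concatenation: the input becomes part of the SQL text itself.
    sql = "SELECT owner, fine FROM tickets WHERE owner = '" + name + "'"
    return conn.execute(sql).fetchall()

def find_safe(name):
    # Parameterized: the value travels separately from the SQL text.
    return conn.execute(
        "SELECT owner, fine FROM tickets WHERE owner = ?", (name,)
    ).fetchall()
```

With `find_unsafe`, a name containing an apostrophe breaks the statement, and an input like `x' OR '1'='1` returns every row. With `find_safe`, both inputs are treated as plain strings, and knowing the column names gives an attacker nothing to work with.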

IshKebab1y ago

'); SELECT * FROM logins --

3 more replies

foota1y ago

Out of curiosity, could you ask for something like "one row of data from every table in the CANVAS database"?

mbreese1y ago

2 more replies

hathawsh1y ago

Kudos to you for enduring through this fight! We can only achieve transparency when people choose not to be complacent. Thank you.

What do you think are the next steps?

chaps1y ago

My first step is to actually finish my post :)

But after that, getting a reasonable law passed to fix this now-broken nonsense.

doctorpangloss1y ago

What are the administrators of CANVAS hiding?

chaps1y ago

3 more replies

butlike1y ago

'ethnicity' header, 'net_income' header... wouldn't doubt chicago could be cave man enough to do this

maCDzP1y ago

manquer1y ago

I think the point of the lawsuit is less about CANVAS schema itself and more about the ability of the government to hide this kind of information from FOIA requests.

notjulianjaynes1y ago

gwerbret1y ago

mcnichol1y ago

I don't want to take away any steam from your sails but giving bad information in regards to case law shouldn't be taken lightly. Your "expert witness" did you a disservice.

I'm glad that you are challenging them but I believe a credible engineer would have made mincemeat of your expert and hurt the rest of us who want to see you successful.

It's possible rewriting certain statutes can help us but there is no company worth its salt that would share DB schema.

thayne1y ago

> Just knowing the structure is not far off from knowing the max entropy a password may hold

> there is no company worth its salt that would share DB schema

1 more reply

ra1y ago

They can produce a report using English-language labels instead of the db column names. Their argument isn't fact; it's vexatious obstinacy.

hennell1y ago

waitwhatwhoa1y ago

When can we submit witness slips? Is there a mailing list for updates we can join? Good luck!

hn_user821791y ago

This older post was such a fantastic read, thanks for sharing your story!

layoric1y ago

It's dated from ~2 weeks ago... is there other date information I am missing?

2 more replies

monksy1y ago

What I want to know: How much malort does the city expense a year?

foota1y ago

Welcome to Seattle :-)

geoduck141y ago

> that's the second biggest FOIA request I've ever seen!

-Guybrush, from The Secret of Monkey Island

mmaunder1y ago

Thanks for fighting the good fight for us all!

rnewme1y ago

The footer links to dead x account.

SkidanovAlex1y ago· 11 in thread

If I understand the argument of the author here:

> Attackers like me use SQL injection attacks to recover SQL schemas. The schema is the product of an attack, not one of its predicates

> the problem is that every computer system connected to the Internet is being attacked every minute of every day

Knowing the schema both expedites their ability to take advantage of the vulnerability and increases their chances of probing the injection without triggering the query failure to begin with.

florbnit1y ago

> that knowing SQL schema doesn't help the attacker.

The barrier is not “any helpful information must be censored”; the barrier is “don’t disclose passwords or code that would divulge backdoors”. A schema cannot be that.

Volundr1y ago

I'm not an attacker, just a boring old software dev. If there's an SQL Injection I'd say all bets are off re: schema.

So I actually do agree that the schema could potentially be of marginal benefit to the attacker.

butlike1y ago

Wouldn't admitting this in court pin you with some sort of negligence? (if you knew having a schema revealed would compromise your app in some way).

2 more replies

pockmarked191y ago

Reminds me that the recently discovered “leak emails using YouTube” exploit kicked off from reading what is essentially, a schema.

https://brutecat.com/articles/leaking-youtube-emails

robocat1y ago

> kicked off from reading what is essentially, a schema.

I wouldn't call json a schema.

In the HN discussion tptacek replied that "$10,000 feels extraordinarily high for a server-side web bug": https://news.ycombinator.com/item?id=43025038

It's hard to know what the reputational cost to Google would be for doxxing popular anonymous accounts. I'm guessing video is not so often anonymous so influencers are generally not unknown?

1 more reply

tptacek1y ago

kmoser1y ago

1 more reply

lucb1e1y ago

> nothing you can do with the schema is going to reduce the signal in that feed --- even a single SQL syntax error would be worth following up on

Syntax errors coming from your web application mean there is a page somewhere with a bugged feature, or perhaps the whole page is broken. Of course that's worth following up on?

1 more reply

wglb1y ago

> "query failed" or "query succeeded, here's the data"

There is nothing in the log on the server that indicates an error.

Most of the relatively introductory SQL injection exercises that I taught proceed without any knowledge of the schema.

This is why SQL injection is so insidious.

berkes1y ago

Not just with SQLi, but I've managed to statistically prove "information" with timing attacks.

For example, if you join another table (e.g. by requesting extra info in a GraphQL query), the response time goes from milliseconds to seconds or even minutes, indicating the size of the joined table.

Even if the response content is exactly the same, we know things exist, are big, not indexed, or simply present, by timing the attack.

gerdesj1y ago

For example: I've just re-wired a three gang light switch. I verified power on with my multimeter (test the meter), cut the power and then retested all the circuits to make sure I had got it right.

I noted down the connections, and took them all out. I put Wagos over the flying ends to make them safe, turned the power back on and got on with the job in hand.

I then cut the power (both circuits) checked again with my Fluke. Oh bollocks ... enable power, test the Fluke and then cut power again and recheck the circuits.

So, given that description is it possible that the connectors might fall out in the future and short on say, the metal back box. Of course it is possible. It could happen but would it happen?

The would/could distinction is a powerful one and it is analogous to how we do risk assessments.

Cripes - I need to get out more!

tptacek1y ago· 10 in thread

Kurt posted this to troll me. Just know my audience here was, mostly, non-technical people involved in politics in my local Chicagoland municipality.

skissane1y ago

> Local politics is, by contrast, extremely responsive. I've gotten things done --- including a law passed

You live in a country where local governments have the power to make laws… in a lot of other countries they don’t - or, to be more precise, their lawmaking power is extremely limited.

bobthepanda1y ago

Even state legislators are, by their nature, pretty much locally driven given the relatively small size of their constituencies and thus the margin of victory.

copypasterepeat1y ago

Would you care to elaborate which law you helped to pass?

Thanks.

tptacek1y ago

My real goal is zoning.

1 more reply

hinkley1y ago

“Never doubt that a small group of thoughtful, committed citizens can change the world: indeed, it's the only thing that ever has.” - Margaret Mead

Y_Y1y ago

Like a hedge fund? Or are we including those committed to violence?

4 more replies

zahlman1y ago

How do you figure out where to go?

tptacek1y ago

AceJohnny21y ago

> (drastically unlike national politics)

Man, I remember your & Maciej's effort to get FIDO keys to the campaign staffers, and how depressing that was

chaps1y ago

Aaaaaaa! I need to finish my post! :(

pavon1y ago· 9 in thread

gregw21y ago

I'm not sure why that wasn't argued by the state and the state argued the database schema was a "file format". Per my reasoning, the state still would have won, but for different reasons.

My 2 cents as someone designing database schemas for 25+ years.

hot_gril1y ago

Schema is definitely software, an operating protocol, source code, and file layout. Maybe also documentation.

tptacek1y ago

A schema isn't software in the sense imagined by the ILGA. If it was, every Excel spreadsheet would be too, and Excel spreadsheets are the basic currency of FOIA.

The court definitively established that SQL schemas aren't source code in the sense imagined by the ILGA. SQL queries can be. Schemas are not.

See downthread for why a schema isn't a file format. In fact, a schema is almost the opposite of a file format.

A court will look at the term "documentation" in the ordinary sense of the word; as in, "a prose description and set of instructions".

"Associated with automated data processing operations" isn't an element in the statute; it's a description of all of the elements.

2 more replies

pavon1y ago

1 more reply

paulddraper1y ago

How is a database schema not a file layout?

kasey_junk1y ago

The article describes why. 2 different db engines (or even instances) can use different file layouts for the same schema.

In many ways SQL is all about divorcing the schema from the files.
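
A toy Python sketch of that separation: one logical schema stored in two different byte layouts, one row-oriented and one column-oriented. The `tickets(plate, fine)` schema, the rows, and every function name are invented for illustration; real database engines use far more elaborate formats.

```python
import json

# One logical schema: tickets(plate TEXT, fine INTEGER). Rows are made up.
COLUMNS = ("plate", "fine")
ROWS = [("ABC123", 60), ("XYZ789", 75)]

def row_layout(rows):
    """Row-oriented bytes: one JSON array per record, newline-delimited."""
    return "\n".join(json.dumps(list(r)) for r in rows).encode()

def column_layout(rows):
    """Column-oriented bytes: all values of one column stored together."""
    columns = {name: [r[i] for r in rows] for i, name in enumerate(COLUMNS)}
    return json.dumps(columns).encode()

def read_row_layout(blob):
    return [tuple(json.loads(line)) for line in blob.decode().splitlines()]

def read_column_layout(blob):
    columns = json.loads(blob)
    return list(zip(*(columns[name] for name in COLUMNS)))
```

The two blobs differ byte for byte, yet both readers return the same rows under the same schema. Knowing the schema therefore says nothing about which layout is on disk.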

3 more replies

hyperpape1y ago

It literally does not describe a file, and does not literally describe the data layout of anything on disk (though with enough knowledge, you may be able to infer facts about probable layouts).

1 more reply

dools1y ago

michaelmrose1y ago

Because it doesn't describe how data is laid out on disk.

1 more reply

Y_Y1y ago· 7 in thread

Even after all that, if the basic sentence structure of what's in the law isn't clear to the courts, hasn't the whole system fallen at the first hurdle?

copypasterepeat1y ago

kmoser1y ago

skissane1y ago

2 more replies

Xelynega1y ago

Correction, that is how the common law legal system works.

Alternatives like codified law exist and are practiced, just not in the US or Canada.

tptacek1y ago

Y_Y1y ago

Even (some) programmers have learnt the dangers of parsing at run time (e.g. "eval is evil"). How can we decide it's the law we want if we don't know what it means yet?

2 more replies

olau1y ago

I find it slightly odd that you get hung up on the file format thing. The law as you quoted it says "including but not limited to" and the first example given is then "software".

EMIRELADERO1y ago· 5 in thread

Am I the only one slightly perplexed/worried by the point-blank source code exemption?

It's easy to imagine a scenario where the city decides to develop a specific software in-house and hide the "biases" in the source code, or any other thing one might not find desirable.

Hell, they don't even need to make everything from scratch! Could just patch and use a permissively licensed 3rd-party component.

In my opinion, the proposed amendment does not go far enough.

manquer1y ago

It shouldn't be surprising?

---

[1] yes kernel problems are largely a function of GPL, more permissive licenses like Apache 2 /MIT would not have, BSD variants after all had no challenges in supporting ZFS.

Y_Y1y ago

1 more reply

contravariant1y ago

In theory the decision to put those biases in the code should be public information. You can ask for the criteria the software was made to, just not the software itself.

Though rulings like this might have a chilling effect.

qingcharles1y ago

dotdi1y ago

That's why it's important to push for "public money - open source" initiatives like some countries in the EU are trying to implement.

Off the top of my head, I think the last (now failed) German coalition had this in their programme but didn't deliver. Maybe the new government will.

duxup1y ago· 5 in thread

Very interesting read.

Working at a small company with lots of clients I wouldn't want to hand out DB schema outright, but I also go out of my way to search / get the client the data they want ... not reject them.

rectang1y ago

A private company wouldn't want to divulge their DB schemas because it's advantageous for competitors to see how you're doing things. That doesn't apply to government databases.

chaps1y ago

Furthermore, in your response, please be mindful that the burden of proving competitive harm rests on the public body [2].

[1] https://www.cityofchicago.org/content/dam/city/depts/dps/Con... [2] http://foia.ilattorneygeneral.net/pdf/opinions/2018/18-004.p...

bob10291y ago

The schema on the last project I worked on was probably our most important IP. Specifically, the ways in which we solved certain circular dependency issues.

I wouldn't take the ability to design a schema for granted. I don't think many people are any good at it. Do not underestimate the value of your work products.

1 more reply

hinkley1y ago

bornfreddy1y ago

Maybe. But now I'm really curious how bad that schema must be for them to hide it so viciously.

4 more replies

aqueueaqueue1y ago· 5 in thread

Interesting takeaways from me:

All that pompous sounding legalese can still be ambiguous! I feel less bad for not understanding contracts that have 100 word compound sentences.

Legal people can't keep up with our tech jargon but they have their own jargon including "predicate" lol. So same logical thinking, different jargon framework.

Question: why do they want the schema not the data?

tptacek1y ago

Because once you have the schema you can issue FOIA requests that include queries for them to run.

darkarmani1y ago

Is the schema considered private information or just information not required to be released via FOIA? ie: Can't some nice employee leak this information or is it legally protected?

Once the information is released, can anyone make FOIA requests using the schema?

1 more reply

hot_gril1y ago

What if you guess common table names? Wonder if they send back the error message.

aerzen1y ago

Could you ask them to run an introspection query? Something like SELECT * FROM information_schema.tables?

aqueueaqueue1y ago

Oh wow! If that is necessary, that is so kafkaesque!

"I want your data"

"What data?"

"What do you have?"

"Ha ha. No. Tell me what you want"

"Your data that is the metadata of your data"

"Well actually..."

...

1 more reply

Terr_1y ago· 4 in thread

> Each spreadsheet has a header row, labeling the columns, like “price” and “quantity” and “name”. A database schema is simply the names of all the tabs, and each of those header rows.

This is also how I explain it to my relatives, I'm kind of surprised this analogy (one so direct that it's almost literal) didn't fly with the judges.

If database column names cannot be revealed, then shouldn't that mean the state is also able to redact the headers of all their spreadsheets?
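
The analogy can be taken almost literally in code. In this Python sketch a "workbook" is a dictionary of CSV strings, and the "schema" is what remains after every data row is thrown away. The tab names, headers, and values are all hypothetical.

```python
import csv
import io

# Hypothetical workbook: each "tab" is a CSV string with a header row.
workbook = {
    "tickets": "plate,fine,issued_on\nABC123,60,2024-11-11\n",
    "officers": "badge,name\n1234,Pat\n",
}

def schema_of(tabs):
    """Return {tab name: header row}, discarding every data row."""
    return {
        name: next(csv.reader(io.StringIO(text)))
        for name, text in tabs.items()
    }
```

`schema_of(workbook)` contains the tab names and column labels and nothing else, which is the information a requester needs in order to ask for specific columns.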

kmoser1y ago

Knowing a spreadsheet header doesn't help an attacker gain access to that spreadsheet in any way. Knowing SQL column names may give an attacker an advantage in accessing a database.

Terr_1y ago

Compare: "Knowing the writing style of current employees may give an attacker an advantage while phishing, therefore, we cannot turn over any memos or emails whatsoever."

Ditto for the org-chart.

flutas1y ago

Per the post, this also wouldn't fly.

1 more reply

butlike1y ago

It's a reverse vlookup

DangitBobby1y ago· 4 in thread

When a law is ambiguous by wording, why do they never ask the people who drafted the law what was intended?

jaza1y ago

That would be against the separation of powers doctrine inherent in all Western democracies. The job of the legislature is to write the law. The job of the judiciary is to interpret the law.

DangitBobby1y ago

1 more reply

tptacek1y ago

The current sitting ILGA is not the ILGA that passed the statute.

DangitBobby1y ago

They are probably still alive, shouldn't be that hard to find. They have no problem giving subpoenas to other witnesses or soliciting expert testimony.

wswope1y ago· 4 in thread

Anyone with a legal background willing to opine about potential workarounds to this ruling?

Specifically, would a request for “data field labels” (i.e. a column list without any table structure info) likely circumvent the exemption?

gpm1y ago

I think that would run afoul of

> The one big limitation of Illinois FOIA (with FOIA laws everywhere, really) is that you can’t use them to compel public bodies to create new records.

Unless for some reason they already had a list of columns without table structure.

(Not that I claim to have a legal background)

wswope1y ago

I had that thought too, but my naive rebuttal would be that the column data already exists by default in any standard RDBMS as information_schema.columns. No new record creation required.

1 more reply

duxup1y ago

Yes but what if we come up with a directive that every FOIA request must be logged into a DB. Therefore every request is automatically invalid as it requires we create a record!

1 more reply

Andys1y ago

Not a lawyer, but why not use opensource as an example? Many successful public e-commerce websites have public schemas and aren't all hacked.

ajkjk1y ago· 4 in thread

tptacek1y ago

Being able to recover schemas from publicly operated databases is vital to keeping public records and data public, rather than de-facto hidden from inquiry.

Matt's suit was anything but a waste of people's time. Hopefully, it'll result in a change to our state law.

hot_gril1y ago

Just because the article gets into fine details doesn't mean it's silly. They're working with what they have.

zonkerdonker1y ago

See here: https://news.ycombinator.com/item?id=43176625

jbritton1y ago

bobsmooth1y ago· 3 in thread

What stands out to me about this article is the time between court appearances. Seems like if you want to accomplish anything in court you need to be prepared to spend years of your life on it.

rectang1y ago

And of course, people and entities (private or as in this case public) who have a lot of resources take advantage of that, a state of affairs which often serves to perpetuate injustice indefinitely.

1 more reply

lucb1e1y ago

barbazoo1y ago

I thought the same thing. Sure it's async but still you have to keep this in your mind for a very long time.

jaxgeller1y ago· 2 in thread

I FOIA'ed >1M pages of docs for my project cleartap.com, a DB of water quality of the USA.

Most states would charge a small amount to gather the documents.

Michigan wanted $50K for the FOIA request. I think because of the Flint lead crisis. They wanted me to go away.

davethedevguy1y ago

I noticed that you do have data for Flint. Did you have to pay it, or is there some appeals process if you're quoted an unreasonable amount?

Great project by the way!

jaxgeller1y ago

Ended up finding the majority of Michigan through scraping.

For example, https://www.cityofflint.com/wp-content/uploads/2023/06/Annua...

inetknght1y ago· 2 in thread

> You also generally can't FOIA the source code of programs they run.

Alas, that part should be illegal under FOIA.

Source code should be open source and verifiable. Being exempt from FOIA circumvents public confidence in the government's use of software.

I'd be curious to learn if/where courts have decided such things already.

jaza1y ago

inetknght1y ago

> maybe there's software talking to that database, whose source code includes the logic of picking when / where parking inspectors should conduct a "random" blitz of issuing fines.

Jean-Papoulos1y ago· 2 in thread

I understand freedom of information, but what exactly does the public gain by Matt getting the database schema ?

And yes, having your db schema out in the wild can be a vector of attack, if only because it allows targeting the sql injections (the blog author himself argues this in court).

The court was right to reject this. Maybe the exact word of the law doesn't ask for it, but the spirit certainly does.

gizmo1y ago

tptacek1y ago

The blog author argued no such thing, because that is not true.

lcnPylGDnU4H9OF1y ago· 2 in thread

> where the only way to get at the underlying data is to FOIA a database query

Was this ever attempted?

  SELECT * FROM `information_schema`.`tables`;

chaps1y ago

Yep, that was done in the FOIA request related to this lawsuit:

  select utc.column_name as colname, uo.object_name as tablename, utc.data_type as type
  from user_objects uo
  join user_tab_columns utc on uo.object_name = utc.table_name
  where uo.object_type = 'TABLE'

https://www.muckrock.com/foi/chicago-169/canvas-database-sch...

lcnPylGDnU4H9OF1y ago

Yeah, it's obvious the double standard here, then. Curious indeed why they are so adamant to keep the schema/data secret.

3 more replies

neilv1y ago· 2 in thread

> [...] where the only way to get at the underlying data is to FOIA a database query.

Can you request the desired information using natural language, based on your guesses of what information they store?

tptacek1y ago

Probably not, because then you'd be asking them to go do research. You FOIA for specific documents and records.

neilv1y ago

So you can ask for the document that is the inspection report from Mel's Diner on date 11/11/2024?

Can you ask for the database record from dispatching that inspection visit to Mel's Diner on 11/11/2024, even if you don't know the exact database column names and relations?

1 more reply

gowld1y ago· 1 in thread

lucb1e1y ago

> So much of litigation is Kabuki theater, grand rhetoric not in any way intended at achieving a just or logical outcome

probably_wrong1y ago· 1 in thread

Random thought: someone should drive to Chicago, get a parking ticket, and then make a FOIA request for all of their information contained in that database.

It won't be the whole database schema, but it would be a start.

chaps1y ago

Short answer -- already been done.

This (spoiler) visualization's going into my eventual post about the lawsuit: https://observablehq.com/d/026992341cc47ff0

alexashka1y ago· 1 in thread

Wowzers, that was a lot of words to express something that's very simple.

A database schema is just an empty form. By looking at an empty form, you know what fields have to be filled in, what type of information they'll contain, etc.

Of course people making data requests need to know what forms are being used to collect and store information.

As for security - not letting people do anything because 'it might be dangerous' is bonkers. The way to secure databases has been known for decades. Let's start living in the 21st century :)
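
To make the "empty form" comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `tickets` table and its columns are invented for illustration and have nothing to do with the actual Chicago database; the point is only that the schema can be read back while the table holds zero rows.

```python
import sqlite3

# Hypothetical table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets ("
    " ticket_id INTEGER PRIMARY KEY,"
    " plate TEXT NOT NULL,"
    " issued_on TEXT,"
    " fine_cents INTEGER)"
)

# PRAGMA table_info yields one row per column:
# (cid, name, declared type, notnull, default value, pk)
rows = conn.execute("PRAGMA table_info(tickets)").fetchall()
schema = [(name, col_type) for _cid, name, col_type, *_rest in rows]

# The "form" is fully described even though nobody has filled it in yet.
row_count = conn.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]

print(schema)
print(row_count)
```

The schema lists field names and types, and the row count shows that no actual records are involved in describing it.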

tptacek1y ago

1 more reply

hnthrow903487651y ago· 1 in thread

>just self-important message-board hedging

I can confidently say it does not stop at message boards for many people, self included

tptacek1y ago

It's a real issue when writing an affidavit or testifying. Lots of ingrained bad habits.

koolba1y ago· 1 in thread

I sure hope the impact of this is not that government entities switch to schemaless databases!

CharlesW1y ago

"Schemaless" is like "serverless" in that there's always a schema, even if it's not enforced by the database and instead applied dynamically by the application layer.
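
A rough sketch of what "applied dynamically by the application layer" can look like, assuming documents are stored as free-form JSON. The field names and the `validate` helper are hypothetical, not taken from any real system.

```python
import json

# Hypothetical field names; the "schema" lives here, in application code,
# rather than in the database.
EXPECTED_FIELDS = {"plate": str, "fine_cents": int}

def validate(doc):
    """Return a list of problems found in a decoded JSON document."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in doc:
            problems.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems

# The store would accept both documents; only the application objects.
good = json.loads('{"plate": "ABC123", "fine_cents": 5000}')
bad = json.loads('{"plate": "ABC123", "fine_cents": "fifty dollars"}')

print(validate(good))
print(validate(bad))
```

Either way, something like `EXPECTED_FIELDS` exists somewhere, which is the sense in which there is always a schema.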

rafram1y ago· 1 in thread

hondo771y ago

I have this cousin, Vinny, who's a lawyer, and he was able to use his girlfriend as an expert witness. Both sides agreed she really knows her stuff because that's what really matters.

scotty791y ago· 1 in thread

> Does the “would jeopardize” language in the statute apply to everything in the exemption, or just to the nearest noun “any other information”?

I think law and lawmaking would be vastly improved if only lawyers learned the miracle of parentheses.

Ylpertnodi1y ago

Commas can be expensive, too.

gervwyk1y ago· 1 in thread

Should have used mongodb in the first place.

qbxk1y ago

lol'd so hard at this

dylan6041y ago

"Retrieve the data of every parking ticket issued to ‘Bob O’ and also all the rest of the information in the database including everyone’s passwords."

This is the example of SQL Injection written in plain English, yet "everyone's" is problematic here in that it's an orphaned single quote. If "Bob O'Conner" is bad, so is "everyone's"
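
For what the stray-apostrophe problem looks like in actual SQL, here is a small sketch with Python's built-in sqlite3 and a made-up `tickets` table. The function names are hypothetical; the contrast is between pasting the name into the query text and passing it as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (owner TEXT, fine_cents INTEGER)")
conn.execute("INSERT INTO tickets VALUES (?, ?)", ("Bob O'Conner", 5000))

def find_naive(owner):
    # Unsafe: the apostrophe in the name ends the string literal early,
    # so the rest of the name is parsed as SQL.
    return conn.execute(
        "SELECT fine_cents FROM tickets WHERE owner = '" + owner + "'"
    ).fetchall()

def find_bound(owner):
    # The value travels separately from the SQL text, so quotes in it
    # are just data.
    return conn.execute(
        "SELECT fine_cents FROM tickets WHERE owner = ?", (owner,)
    ).fetchall()

try:
    find_naive("Bob O'Conner")
    naive_failed = False
except sqlite3.OperationalError:
    # SQLite rejects the malformed statement.
    naive_failed = True

print(naive_failed)
print(find_bound("Bob O'Conner"))
```

With the naive version, a harmless name is enough to break the query; a crafted one can change what it does.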

kingforaday1y ago

Given the Illinois Supremes' decision, seems like an opportune time to say "Everything is a file".

1. https://en.m.wikipedia.org/wiki/Everything_is_a_file

boxed1y ago

rubymancer1y ago

It's Matt Chapman! https://mchap.io/

I helped him process and visualize the original batch of parking ticket data waaaay back in 2016.

I can't believe he's still on this in 2025. We need more junkyard dogs like him fighting for what's right.

djeastm1y ago

I suppose I need to change all my column names to random 16-character strings so I don't leave my database insecure!

indymike1y ago

There is no freedom of information if the public is not allowed to know what data the government has.

lq9AJ8yrfs1y ago

Feels like there is an important theme here that SB0226 is dancing around: could government be legible in addition to being "plain-text" transparent?

"plain-text description" of "each field of each database of the public body" and "specific database queries" may not do what you mean.

Not sure how to fix it though.

I could see gratuitous ORMs and database-of-databases patterns winning tax dollars with taunt-them-with-the-schema listed as a feature.

thayne1y ago

I'm confused why file layout is included in the list of exceptions in the first place. If an adversary knowing your file format is a security problem, then you are doing something very wrong!

makach1y ago

Does disclosure of a database schema really jeopardize the security of the system? Yes

How plausible or likely does that jeopardy need to be? Very

Does a database schema constitute “source code”? Yes

Is a SQL schema a “file format”? No & yes. In that order.

And, finally, does the “would jeopardize” language apply to everything in the exemption, or just to the nearest noun “any other information”? Yes

irrational1y ago

> I’ll conclude this long piece by saying (1) obviously the bill should pass, and (2) it should be called “The Chapman Act”.

(3) I imagine Chicago greatly regrets towing Matt Chapman "over a facially bogus ticket".

gavin_gee1y ago

https://x.com/JackRhysider/status/1885732851779285184

b81y ago

Got to see this happen day by day on the Midwest Venture Partners Slack. There was another lawsuit Chapman and Tom did for laser-based speed detection in Chicago.

pudding123451y ago

Do stored procedures count as part of the schema? I've recently found a SQL injection vulnerability in a client's SP that was using concat (very badly)
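
A sketch of the concat pattern described above. SQLite has no stored procedures, so plain Python functions stand in for the procedure body here, and the `users` table, function names, and payload are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

def lookup_concat(name):
    # Unsafe: caller-supplied text becomes part of the SQL statement.
    sql = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()

def lookup_bound(name):
    # Safe: the value is bound as a parameter and never parsed as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()

# The quote closes the literal and the OR clause matches every row.
payload = "nobody' OR '1'='1"

leaked = lookup_concat(payload)
safe = lookup_bound(payload)

print(len(leaked))
print(safe)
```

Note that the payload works without knowing any column names, since it only has to escape the string literal it was concatenated into.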

ngriffiths1y ago

> Congratulations! You now understand databases.

Data engineering: doing a lot of fancy work to make a very simple product

el_snark1y ago

Enjoyed the read. Good luck with the future developments.

Now a nerdy question. As someone who investigates SQL injections, why are you running a server based on nginx 1.4.6? Do you know something I don't? :-)

abfan11271y ago

Am I the only one disappointed there's no mention of little Bobby Tables?

gunian1y ago

sql injection court seems more fun than slave court where they tell you spending anything above 5 is a crime lmaooooo

lucb1e1y ago

Since there is no contact method on the website, figured I'd mention it in a comment; hope this helps

1 more reply

lubujackson1y ago
