I think the reasons for this are complex.
First, security rules as implemented by Firebase are still a novel concept. A new dev joining a team and adding data into an existing location probably won’t go back and fix the rules to reflect that the privacy requirements of that data have changed.
Second, without the security through obscurity created by random in-house backend implementations, scanning en masse becomes easier.
Finally, security rules are just hard. Especially for the Realtime Database, they are hard to write and don’t scale well. This comes up a lot less than you’d think, though: any time automated scanning is used, it’s just looking for open data, and anything beyond “read write true” (as we called it) would have prevented this.
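For context, the wide-open Realtime Database ruleset being referred to as “read write true” is literally this config fragment:

```json
{
  "rules": {
    ".read": true,
    ".write": true
  }
}
```

This grants every client, authenticated or not, full read and write access to the entire database; any stricter condition at all would have defeated the kind of automated scan described here.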
Technically there is nothing wrong with the Firebase approach, but because it is one of the only backends that uses this model (one based around stored data and security rules), it opens itself up to misunderstanding, improper use, and issues like this.
Unlike a backend where the rules for validation and security are visible and part of the specification, Firebase's security rules are something one can easily forget: they are a separate process and have to be reevaluated as part of every new feature developed.
What kind of apps are people building where you don't need backend logic?
This is much better than trying to figure out what are the security-critical bits in a potentially large request handler server-side. It also lets you do a full audit much more easily if needed.
This is a battle we slowly lost. It started with all of support being the original team, then went to 3-4 full-time staff plus some contractors, to entirely contractors (as far as I’m aware).
This was a big sticking point for me. I told them I did not believe we should outsource support, but they did not believe we should have support for developer products at all, so I lost to that “compromise.” After that I volunteered myself to do the training of the support teams, which involved traveling to Manila, Japan and Mexico regularly. This did help, but like support as a whole, it was a losing battle and quality has declined over time.
Your experience is definitely expected and perhaps even by design. Sadly this is true across Google, if you want help you’d best know a Googler.
This raises the question: isn't this a security vulnerability after all?
A big problem with writing security rules is that almost any mistake is going to be a security problem, so you really don't want to touch them if you don't have to. It's also really obvious when the security rules are locked down too much, because your app won't function, but really non-obvious when the security rules are too open, unless you actively probe for access you shouldn't have.
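That kind of probing can be sketched in a few lines. This is a hypothetical illustration, not the scanner from the article: Firestore exposes a public REST endpoint, and an unauthenticated list request either succeeds (rules are open) or is denied (rules are doing their job). The project and collection names below are made up.

```python
# Sketch of probing a Firestore project for publicly readable data via the
# public REST API. probe_url() only builds the URL; interpret() maps the HTTP
# status of an unauthenticated GET to a verdict. Names are illustrative.
from urllib.parse import quote

FIRESTORE_API = "https://firestore.googleapis.com/v1"

def probe_url(project_id: str, collection: str) -> str:
    """Build the unauthenticated REST URL for listing a collection."""
    return (f"{FIRESTORE_API}/projects/{quote(project_id)}"
            f"/databases/(default)/documents/{quote(collection)}")

def interpret(status_code: int) -> str:
    """Map the HTTP status of an unauthenticated request to a verdict."""
    if status_code == 200:
        return "OPEN"      # rules allow public reads
    if status_code == 403:
        return "LOCKED"    # rules denied the request: working as intended
    if status_code == 404:
        return "MISSING"   # no such project or collection
    return "UNKNOWN"
```

Actually issuing the request would be something like `interpret(requests.get(probe_url("my-project", "users")).status_code)`; the point is that “too open” is only visible if you make exactly this kind of request without credentials.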
Related idea: force the dev to write test case examples for each security rule where the security rule will deny access.
Our use of Firebase dates back 10+ years so maybe the modern rules tools also do this, I don't know.
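The “deny test” idea can be sketched without any Firebase tooling at all. Below, a plain Python predicate stands in for a security rule (a toy model, not the real rules engine; real projects would run deny cases against the emulator): the point is that every rule ships with at least one case it must deny.

```python
# Toy stand-in for a rules engine: a "rule" is a predicate over (uid, doc).
# Illustrative only; real rules are evaluated by Firebase's own engine.
from typing import Callable, Optional

Rule = Callable[[Optional[str], dict], bool]

def owner_only(uid: Optional[str], doc: dict) -> bool:
    """Allow access only when the requester owns the document."""
    return uid is not None and uid == doc.get("owner")

def check(rule: Rule, uid: Optional[str], doc: dict) -> bool:
    return rule(uid, doc)

# The parent comment's suggestion: alongside the allow case, the dev must
# write cases where the rule DENIES access.
doc = {"owner": "alice", "secret": 42}
assert check(owner_only, "alice", doc) is True     # allowed: the owner
assert check(owner_only, "mallory", doc) is False  # denied: wrong user
assert check(owner_only, None, doc) is False       # denied: unauthenticated
```

A rule with no failing deny test is exactly the “too open” case that nothing in normal app development will ever surface.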
What would really help us, though, would be:
1. Built-in support for renaming fields / restructuring data in the face of a range of client versions over which we have little control. As it is, it's really hard to make any non-backwards-compatible changes to the schema.
2. Some way to write lightweight tests for the rules that avoids bringing up a database (emulated or otherwise).
3. Better debugging information when rules fail in production. IMHO every failure should be logged along with _all_ the values accessed by the rule, otherwise it's very hard to debug transient failures caused by changing data.
It's a conceptual model that is not sufficiently explained. How we talk about it on our own projects is that each collection should have a conceptual security profile (public, user data, public-but-auth-only, admin-only, etc.) and then use security rule functions to enforce these categories, instead of writing a bespoke set of conditions for each collection.
Thinking about security per-collection instead of per-field mitigates mixing security intent on a single document. If the collection is public, it should not contain any fields that are not public, etc. Firestore triggers can help replicate data as needed from sensitive contexts to public contexts (but never back).
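The per-collection profile idea can be made concrete. This is a hypothetical sketch with made-up collection names: each collection is classified once, and access decisions derive from the profile rather than from bespoke per-collection conditions.

```python
# Sketch: classify each collection under one security profile, then derive
# read decisions from the profile. Collection names are made up.
from enum import Enum, auto
from typing import Optional

class Profile(Enum):
    PUBLIC = auto()      # readable by anyone
    AUTH_ONLY = auto()   # public-but-auth-only
    USER_DATA = auto()   # readable only by the owning user
    ADMIN_ONLY = auto()  # readable only by admins

# One declaration per collection, the single place security intent lives.
COLLECTIONS = {
    "articles": Profile.PUBLIC,
    "comments": Profile.AUTH_ONLY,
    "profiles": Profile.USER_DATA,
    "audit_log": Profile.ADMIN_ONLY,
}

def can_read(collection: str, uid: Optional[str],
             owner: Optional[str] = None, is_admin: bool = False) -> bool:
    profile = COLLECTIONS[collection]
    if profile is Profile.PUBLIC:
        return True
    if profile is Profile.AUTH_ONLY:
        return uid is not None
    if profile is Profile.USER_DATA:
        return uid is not None and uid == owner
    return is_admin  # ADMIN_ONLY
```

In actual rules files the same shape falls out of shared rule functions (one per profile) that each `match` block calls, so a reviewer only needs to check that every collection is assigned the right profile.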
The problem with this approach is that we need to document the intent of the rules outside of the rules themselves, which makes it easy to incorrectly apply the rules. In the past, writing tests was also a pain — but that has improved a lot.
But with Firebase security rules, I now pretty much have half of a server implemented just to get the rules working properly, especially for more complex lookups. And for those rules, the tooling simply wasn't as great as using TypeScript or the like.
I haven't used firebase in years tho, so I don't know if it has gotten easier.
There are good reasons we reject "security through obscurity" as a valid defense, and just because "structural diversity" makes automated scanning harder doesn't mean it can't be done. See Shodan.
> After the initial buzz of [pwning Chattr.ai] had settled down, […]
Insane.
Some days I think one ought to be licensed to touch a computer.
The vulnerability emails probably got dismissed as spam, or forwarded on and ignored, or they’re caught in some PM’s queue of things to schedule meetings about with the client so they can bill as much as possible to fix it.
> Some days I think one ought to be licensed to touch a computer.
There are plenty of examples of fields where professional licensing is mandatory but you can still find large numbers of incompetent licensed people anyway. Medical doctors have massive education and licensing requirements, yet there is no shortage of quack doctors and licensed alternative medicine practitioners.
Not saying you should do that given the current state of the laws.
I’d be wary of any company listed here that made that decision and hasn’t changed leadership, as it has been proven time and time again that many companies simply don’t care about customers enough to protect them. History repeats itself.
If so, I hadn't realized how common that architecture had become for sites with millions of users.
Could be a mix. Firebase also offers Firebase Functions which are callable functions in the cloud. That code is not public.
However, Firestore and the Firebase Realtime Database both require the user to set up security rules. Otherwise all data can be read by anybody.
Writing appropriate authz rules on the backend has to be made easy.
Still, this makes the internet scarier. Most people don't have a clue how fragile the web is and how vulnerable they are.
As time goes on, services make building websites easier and abstract away more, which leaves devs oblivious to what they still have to configure.
It’s not enough; make sure to use a unique email for each service you sign up for. This limits the damage in case of an incident and protects your privacy, as no one can perform OSINT on you to cross reference other services. Additionally, I’ve found that sometimes you can detect a site breach before the owners do when you receive a malicious email sent to that unique address.
Unfortunately that's a big hassle that I am not willing to go through.
Apple's approach to pseudo-emails was very nice and, in my experience, works very well, but as a mainly-PC user I can't take advantage of it.
Do you know or recommend a service for this that's easy and fast to use?
Anyone have more info about this issue? I've got a scraper myself in Python with a few hundred threads which seems to eat a lot of memory. Any workarounds or is the only solution to rewrite in another language?
Personally I prefer using processes rather than threads, with a worker pool and a message bus rather than shared memory. That solution has its own drawbacks (and a bit more overhead), but you don't need to worry so much about memory issues. Processes also seem a better match for crawlers, since the number of processes will be fairly constant and the work the processes do is fairly independent.
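A minimal sketch of that process-pool shape, with a placeholder `fetch` standing in for real download logic (names are made up; a real crawler would add the message bus and network I/O):

```python
# Process-based crawler worker pool: each task runs in a separate process,
# so memory held by one task is reclaimed when the pool recycles workers.
from multiprocessing import Pool

def fetch(url: str) -> tuple[str, int]:
    # Placeholder for real download logic; here just "measures" the URL.
    return (url, len(url))

def crawl(urls: list[str], workers: int = 4) -> list[tuple[str, int]]:
    # maxtasksperchild bounds how long any one process can accumulate memory.
    with Pool(processes=workers, maxtasksperchild=100) as pool:
        return pool.map(fetch, urls)

if __name__ == "__main__":
    print(crawl(["https://example.com/a", "https://example.com/bb"]))
```

Compared to a few hundred threads, the worker count stays small and constant, and a leaky or bloated task can't grow the parent's heap indefinitely.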
    import multiprocessing as threading

Rewriting it is the only real solution; I don't know your exact problem.
I’d be interested to know how you’re coming to the conclusion that the number of affected users is likely higher. From the looks of it, I’d suspect that at least some of the sites you mention (gambling, Lead Carrot) are littered with fake account data.
The reason we are saying it's more is that there are likely other services, not in our scan list, that could be vulnerable.
0: https://firebase.google.com/docs/reference/admin/java/refere...
We only scanned for Firestore, which is a NoSQL database; conversion tools may still be possible. A good Firebase alternative would be https://supabase.com, but please set up RLS; it's IMO much easier than Firebase.
Postgres is an "object-relational database", so you could use array, JSON, or JSONB fields wherever necessary, and you wouldn't need to introduce any foreign key relations or the like.
And we're still in the Wild West when it comes to internet business even after 20 years of "verified" domains.
"Your firebase allows anyone to read information"
"i want to be your gf"
edit: oh wait. Did it hit itself with the bat?
I have no idea what they're trying to communicate lol. In my experience Line stickers are often used the same way "huh" is, when you don't know wtf else to say but you'd like to terminate the conversation somehow.
Not only that but they provide the same crappy services to schools and scummy gambling websites alike.
It frustrates me watching people who believe they are professionals flock to these services. Honestly, if you can't roll your own you probably shouldn't let someone roll this for you. But Google won't say no, and none of you cloud devs can help yourselves. So we have this race to the bottom in cost and time to market, and all the products are least-common-denominator shit that gets built in 6 hours by copy-pasting as many GH repositories together as possible on rented infra.