- Serve traffic behind a load balancer that has a WAF
- Network segregation for database (separate subnets)
- Serve HTTPS with a valid cert, and redirect HTTP to HTTPS
- Restrict ports on LB
At some point later:
- Endpoint monitoring and threat detection
- VPC flow logging
- Execute backend as non root
- Dependency / artifact scanning
- Cloud SIEM to monitor common actions taken
- Make sure there are no hard-coded creds, i.e. use role-based auth with cloud providers
- Reproducible infrastructure builds with infra as code
- Email domain protection
- Grab misspellings of domain names to prevent squatting
What's the cheapest non-AWS way to do this? Cloudflare on everything? Is there another option? Just trying to learn what's out there. WAF mainly protects against ddos right?
The cheapest option would be self-hosting something ModSecurity compatible: https://en.wikipedia.org/wiki/ModSecurity
You'd also need a ruleset, for which the OWASP one might be a starting point: https://owasp.org/www-project-modsecurity-core-rule-set/
There are also some projects like Coraza in the works: https://coraza.io/
Probably not what you're looking for if you want a cloud service to take care of everything for you, though, because of the question below (just thought that it might be useful to point out that anyone can run their own WAF if need be).
> WAF mainly protects against ddos right?
Typically WAF might be offered as a part of a larger cloud service that would include DDoS protection.
However, on its own, it is meant to filter traffic that might be harmful and attempt to exploit various vulnerabilities. A bit like an anti-virus in a sense, but for web requests. Some people argue that WAF solutions can be problematic because they encourage an attitude of "so what if there's a log4j vulnerability in the codebase, the WAF will take care of it" instead of making sure that the actual code is secure, but opinions are split there (defense in depth and the Swiss cheese model).
Network segregation for database (separate subnets) would be a config option wherever you're hosting (AWS/Google Cloud/etc.) said database/application.
What is a WAF?
It’s a feature of an LB that consolidates a few things: blocking ports except for the ones you are using, failing fast on paths that scrapers tend to check (e.g. /wp-admin, /phpMyAdmin) so they don’t end up in normal request logging, setting rate limits, fail-to-ban conditions, etc.
Implement some basic rate limiting by IP so you don't get your Google Maps API DoSed. Block China and Russia altogether unless you expect customers from there (sadly, many bots & drive-by scans originate there). Sanitize your inputs, especially if you have any that will reach one of your own endpoints like for a database lookup (and look into SQL injection prevention in general). Use prepared statements in PHP if you use that for DB access. Not sure about Python.
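On the Python side, the standard DB-API drivers support parameterized queries, which do the same job as prepared statements for injection purposes. A minimal sketch using the built-in sqlite3 module; the `places` table and the `find_places` helper are made up for illustration, and other drivers use a different placeholder style (e.g. `%s`):

```python
import sqlite3


def find_places(conn, name):
    # The ? placeholder keeps user input as data; it is never spliced into the SQL text.
    cur = conn.execute("SELECT id, name FROM places WHERE name = ?", (name,))
    return cur.fetchall()


# Hypothetical in-memory table, just for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO places (name) VALUES (?)", [("cafe",), ("library",)])

# A normal lookup finds the row.
print(find_places(conn, "cafe"))

# A classic injection attempt is compared as a literal string and matches nothing.
print(find_places(conn, "' OR '1'='1"))
```

The thing to avoid is building the query with string formatting or concatenation, since that is where the input gets a chance to become SQL.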
You can read OWASP guidelines for other best practices (https://owasp.org/www-project-top-ten/) or ask ChatGPT to summarize. But realistically, Cloudflare takes care of so much that it seems a bit foolhardy to try to DIY it these days...
If it were me doing this, I wouldn't self-host anything at all, and just use managed services all the way down, including the DBs. A lot less maintenance that way, especially for solo devs. Lets you focus on the business logic instead of trying to reinvent your own secure little nano cloud. It takes serious manpower to stay on top of the latest vulnerabilities and zero-days, and IMO it's not worth spending your limited time on that when the big clouds can do it much more cheaply and much more thoroughly... it's a full-time job in and of itself, and you still probably wouldn't keep up with all the latest attacks =/
Of course you end up learning less this way because other professionals do all the hard work for you. But unless you want to become a backend/security professional yourself and REALLY dive deep into this stuff, I don't think just having basic security skills is going to do you much good anyway, since it takes all of 30 seconds to spin up a pre-hardened cloud host these days, usually for free, and they will have much more exhaustive coverage. Just my 2c.
While I very much understand where this sentiment comes from, please do not blindly recommend CF.
Cloudflare seems invisible for gullible users, but is unusable and hostile to humans.
I use a VPN to a static IP by Hetzner, not to hide my true identity, but because I have to: in my current living situation, my (only available) internet runs through a corporate network, packet filtering/logging and all. (Yes this is all legal and I am grateful).
But to retain any kind of privacy I still have to use a VPN. My public IP is registered directly to my full name and has not changed in 3 years.
I also try and limit the amount of unnecessary data my browser transmits.
The combination of those has CF absolutely convinced that I am an existential threat to any site they so honorably "protect".
I simply cannot use ANY site with the default CF configuration. And no, I'm not the only one. This is a very common problem among humans who don't want to share everything about themselves to pass a human verification.
Cloudflare is the cancer of the Internet. They protect and enable criminals, only to sell the solution later. All the while, ridiculing humans into giving up more and more data in the name of safety. They trick users with promises of "Securing the connection" when they are just matching the browser to their database to sell another page visit. The internet used to be a free and open connection to the world; Cloudflare has built a panopticon of surveillance and false security and they are being praised for it.
There's a CloudFlare "essentially off" option that I've always hoped would make a difference when it comes to that. I always set it to that when setting websites up with CloudFlare, in hopes that it makes a difference.
That way I can still make use of the CDN and all the other features of CloudFlare without actually bugging visitors.
Would you be willing to load one of my websites[0] and let me know if "essentially off" actually works for you? If it does, great, but if it doesn't, I'll at least be aware that CF is a problem no matter what setting you put it at.
Yes, it sucks that a few (very few, in my experience) real users might get affected, but that's outweighed by the thousands if not millions of other useless bot visits that would otherwise get through. None of the small orgs I've worked for had the time or personnel to manually filter through those otherwise... it's just too much.
That said, whenever I could, I would happily tweak the rules or make an IP whitelist exception for real users who emailed us complaining they couldn't access something because of Cloudflare, but that only ever happened once or twice as far as I can remember.
--------------
> The combination of those has CF absolutely convinced that I am an existential threat to any site they so honorably "protect".
I'm sure you know this, but CF isn't a targeted attack towards you. Your usage patterns are just different from most people's, and unfortunately get treated as a bot's because they look like one. You can email the site operators to ask for an exception, or... frankly... probably they'd just rather lose you as a customer than deal with making the website work for you :(
If the alternative is to either spend 10x more time on securing the website manually, or loosen security such that it impacts all their other customers... it's usually a no-brainer to choose to just live with the false positives instead and deal with them on a case-by-case basis as they come in.
> Cloudflare is the cancer of the Internet. They protect and enable criminals, only to sell the solution later. All the while, ridiculing humans into giving up more and more data in the name of safety.
I think our experiences have been different in this regard. IMO they are one of the most useful service providers on the Web, not just for WAF stuff but also their excellent CDN and serverless products, etc. You don't have to agree, but they didn't become this big by offering a bad product... probably most site operators would value overall server stability more than an atypical user's needs.
* Disable SSH access for 'root' username.
* If you're using JWTs anywhere, don't mistake them for encryption - they are not.
* Check you're only serving over https.
* Don't trust your frontend. Any security check built into the frontend is near-useless, as the user can reprogram it however they like.
* Strings are how you let the baddies in, especially if you manipulate and concatenate them. Read about SQL injection to find out more.
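To make the JWT point above concrete: the payload of a standard signed JWT is just base64url-encoded JSON, so anyone holding the token can read it without any key. A small sketch using only the standard library; the token is assembled by hand with a dummy signature and made-up claims, so nothing here verifies anything:

```python
import base64
import json


def b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing '=' padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def read_jwt_payload(token: str) -> dict:
    """Decode the payload segment without a key and without verifying the signature."""
    payload_b64 = token.split(".")[1]
    padding = "=" * (-len(payload_b64) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


# Hand-built, token-shaped string; the claims and signature are placeholders.
header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "user-42", "role": "admin"}).encode())
token = f"{header}.{payload}.not-a-real-signature"

# No secret involved, yet the claims are fully readable.
print(read_jwt_payload(token))
```

The signature only protects against tampering, so don't put anything in the claims that the holder (or anyone who intercepts the token) shouldn't see.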
I would love to understand the assumptions that lead to this belief. It makes negative sense?
Just escape every input. For SQL, use parameters to avoid SQL injection: https://datacadamia.com/data/type/relation/sql/parameter For HTML, escape entities in case somebody tries to inject HTML: https://datacadamia.com/web/html/entity
That gets you most of the way: 99% of security holes patched.
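For the HTML half, a rough sketch of what escaping on output looks like in Python with the standard library's `html.escape`. The `render_comment` helper is hypothetical; in practice a template engine with auto-escaping usually does this for you:

```python
import html


def render_comment(user_text: str) -> str:
    # Escape <, >, & and quotes so the browser shows the text instead of running it.
    return "<p>" + html.escape(user_text) + "</p>"


print(render_comment('<script>alert("hi")</script>'))
# → <p>&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;</p>
```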
All the best
One other thing is to limit input frequency, only allow a certain amount of posts over some period of time. Enforce this on both the front and back-end.
A little more complex, you can set a lifetime limit per user by IP address, which won't stop a truly dedicated attacker but will definitely block most of the random web crawler scripts that find your site.
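A back-end frequency limit like this can be sketched in a few lines. This is an in-memory sliding window keyed by IP, assuming a single server process; the class name and the limits are made up, and a real deployment would more likely lean on the LB/WAF or a shared store:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.hits = defaultdict(deque)  # client_ip -> timestamps of accepted requests

    def allow(self, client_ip):
        now = self.clock()
        recent = self.hits[client_ip]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True


# Example: 5 posts per minute per IP (numbers are arbitrary).
limiter = SlidingWindowLimiter(max_requests=5, window_seconds=60)
results = [limiter.allow("203.0.113.7") for _ in range(6)]
print(results)  # the sixth request inside the same minute is refused
```

A lifetime cap per IP would be the same idea with a plain counter instead of a window.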
As you launch more and spend more time dealing with users the default things to do will become second nature, and you’ll find yourself using the built in tools from AWS, DigitalOcean, CloudFlare, etc. rather than rolling them yourself.
But seriously, just launch. There’s a really good chance you won’t have any problems.
Fast roll-back/restore is a useful feature for improving availability but does nothing to improve security.
Never trust your frontend data ever!
Always assume the attacker can talk to your API.
Don't do auth or login yourself. Use known libs and workflows.
Have unit tests to verify your endpoints need auth (a valid user, not just an anonymous user).
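As a rough idea of what such a test could look like, here is a sketch with unittest. The `handle_request` dispatcher and the route table are toy stand-ins, not any framework's API; in a real app you would hit your endpoints through the framework's test client instead:

```python
import unittest

# Hypothetical route table: path -> whether a logged-in user is required.
ROUTES = {"/health": False, "/api/search": True, "/api/profile": True}


def handle_request(path, user=None):
    """Toy dispatcher returning an HTTP status code."""
    if path not in ROUTES:
        return 404
    if ROUTES[path] and user is None:
        return 401  # anonymous caller on a protected endpoint
    return 200


class EndpointAuthTests(unittest.TestCase):
    # Listed explicitly so a route that silently loses its auth flag fails the test.
    PROTECTED = ["/api/search", "/api/profile"]

    def test_protected_endpoints_reject_anonymous(self):
        for path in self.PROTECTED:
            with self.subTest(path=path):
                self.assertEqual(handle_request(path), 401)

    def test_protected_endpoints_accept_valid_user(self):
        for path in self.PROTECTED:
            with self.subTest(path=path):
                self.assertEqual(handle_request(path, user="alice"), 200)
```

Saved in a file, this runs with `python -m unittest`. The useful part is the explicit list of protected paths, so a new or refactored endpoint can't quietly end up public.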
Now, this does not allow me to, say, do Python web apps (that are not WASM). Hostinger has VPSes for quite cheap, which I would consider if I needed that. (If AWS Lambda does not make sense: I ran a Python app on Google Cloud App Engine for a month, https://crimede-coder.com/graphs/Dallas_Dashboard, and that was pricey, like $80 a month, whereas the WASM app is no additional cost.) And I am sure there are other vendors that are similar (I am just happy with Hostinger).
So in terms of DDOS protection this is not so great, but that would not be a big deal to me. So site goes down, but I do not rack up a bill or anything.
For a Google Maps application, I not uncommonly see people put API keys in client-side JavaScript (not good!). It depends on what exactly you are doing, but if it is a public service that users do not sign into, rate limiting the number of API queries with some PHP + database logic server side should not be too much work, and is a reasonable way to avoid racking up a surprise bill (I forget if Google allows you to limit the API keys directly or if they will just rack up bills).
To turn the question around a bit - you've identified the possible routes of compromise/exploitation (i.e. untrusted user input). The first step to me is a threat model. Work out the "so what" of why someone would try to attack you. What would be their end-goal?
To give you a few first steps, you've mentioned using a Google Maps API, and searching based on device location. Presumably your use of the Maps API is paid, and therefore a potential motivation for an attacker is financial, coming from your use of that API. Therefore treat that (i.e. the ability to make requests using your Google Maps API key) as a "target" in your architecture.
From there, you can do things to be a less attractive target (rate limiting, limiting results shown, if you are charged per-result). You could also review your code logic to ensure that only the right kind of request can be made (i.e. that someone modifying the client-side can't trick your server into accidentally making entirely arbitrary paid maps API requests on their behalf).
At this point, you'd also want to figure out your threat model between client-side and server-side, and what is exposed where. Assuming your server-side makes the API requests to Google Maps (and if not, then you're presumably exposing your API creds to clients, which is a "stop right here, don't proceed" moment!), what is allowed to flow from client to server? Can a rogue client get your server to make an arbitrary query? Would that let them use you as a free Google Maps API broker?
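One way to keep a rogue client from steering the upstream query is to accept only the few fields you expect and fix everything else on the server. A minimal sketch; the field names, the fixed radius and category, and the shape of the returned query are all placeholders, not the real Maps API parameters:

```python
# Server-side constants the client cannot override (values are placeholders).
FIXED_RADIUS_M = 1000
FIXED_CATEGORY = "cafe"
ALLOWED_KEYS = {"lat", "lng"}


def build_maps_query(params: dict) -> dict:
    """Turn untrusted client params into the one kind of query we are willing to pay for."""
    unexpected = set(params) - ALLOWED_KEYS
    if unexpected:
        raise ValueError(f"unexpected parameters: {sorted(unexpected)}")
    try:
        lat = float(params["lat"])
        lng = float(params["lng"])
    except (KeyError, TypeError, ValueError):
        raise ValueError("lat and lng must be present and numeric")
    # NaN fails these comparisons too, so it is rejected along with out-of-range values.
    if not (-90 <= lat <= 90 and -180 <= lng <= 180):
        raise ValueError("coordinates out of range")
    return {
        "location": f"{lat:.5f},{lng:.5f}",
        "radius_m": FIXED_RADIUS_M,
        "category": FIXED_CATEGORY,
    }


print(build_maps_query({"lat": "52.52", "lng": "13.405"}))
```

Anything outside the allowlist is rejected rather than passed through, which is what stops the server from acting as a general-purpose broker.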
Understanding the trust architecture between front and back-end is (for me at least) key, as that's the primary exposed attack surface to an end user. Open up developer tools (F12), and look around requests as you use the app. Is there anything here that you wouldn't want users to see? As attackers will definitely see that, and it will be the first place they go to look at what you are doing!
Other ways to mitigate these risks could be (if you have sufficiently constrained input sets) to implement caching to avoid the ability to rack up queries against the underlying maps API. Given you are using arbitrary user locations, that's a bit harder. If users have a session or other short to medium term identifier, you could do some smart rate limiting to detect rampant scanning of large areas by making API requests that spoof the device location to be loads of different locations.
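Even with arbitrary user locations, some caching is possible if you are willing to snap coordinates to a grid first. A sketch of that idea, assuming results for nearby points are interchangeable for your use case; `fetch_from_maps_api` is a stand-in for the paid call, and the counter exists only to show how often it runs:

```python
from functools import lru_cache

CALLS = {"upstream": 0}  # demo counter for how often the paid call happens


def fetch_from_maps_api(lat, lng):
    """Stand-in for the paid upstream request (hypothetical)."""
    CALLS["upstream"] += 1
    return f"results near {lat},{lng}"


@lru_cache(maxsize=10_000)
def _cached_lookup(lat_bucket, lng_bucket):
    return fetch_from_maps_api(lat_bucket, lng_bucket)


def nearby(lat, lng, precision=2):
    # Two decimal places is roughly a 1 km step in latitude, so users
    # standing near each other share one cached upstream result.
    return _cached_lookup(round(lat, precision), round(lng, precision))


nearby(52.5201, 13.4012)
nearby(52.5234, 13.4049)  # same grid cell, served from the cache
nearby(48.8566, 2.3522)   # different cell, goes upstream
print(CALLS["upstream"])
```

The cache here never expires, so a real version would want a time limit on entries, and you should check that the API's terms allow caching results at all.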
If you follow this process, and work out what's worth attacking (your infrastructure will be one of them - even just to compromise the site, post spam, etc, as will things like any database you run), then you can begin to understand those risks, and work out where there are attack vectors, and mitigate them methodically. The OWASP top 10 guidelines are a good starting point - often the biggest issues are design mistakes, omissions of basic measures, or flawed attempts to implement basic measures. If you have authenticated API endpoints, for example, is the authentication logic correct, and meaningful? Does it actually do what you intend, and is what you intend sufficient for the level of security you want to have?