There’s also the issue of CGNAT (carrier-grade NAT), where many subscribers share a single public IPv4 address. If you rate limit too strictly by IP address, you harm legitimate users who are stuck behind CGNAT, especially in Asia and Africa. India is a particularly bad case.
As for residential proxy services like Luminati, if you’re being sufficiently sneaky, chances are you’re not going to snowball in the first place. I’m not sure why anyone would bother paying for Luminati to crawl sites like the one I work for, but I have seen people use it to scam.
We can’t really be bothered to waste resources blocking well-behaved crawlers. Just keep it at a reasonable pace, respect error responses (especially 429 Too Many Requests, but also 410 Gone and 503 Service Unavailable), and make sure we have a way to contact you if things go wrong.
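To make that concrete, here is a rough sketch of what "well-behaved" could look like on the crawler side, using only the Python standard library. Everything specific in it is my own assumption for illustration, not something any particular site requires: the `ExampleCrawler` name, the contact URL and address, the 5-second base delay, the one-hour cap, and the helper names `parse_retry_after`, `decide`, and `fetch`.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical values: use your own crawler name and real contact details.
USER_AGENT = "ExampleCrawler/0.1 (+https://example.com/crawler; crawler-admin@example.com)"
BASE_DELAY = 5.0    # assumed "reasonable pace": seconds between requests to one host
MAX_DELAY = 3600.0  # cap on our own exponential back-off, not on the server's wishes


def parse_retry_after(value, now=None):
    """Return a Retry-After header as seconds, or None if missing or unparseable."""
    if not value:
        return None
    value = value.strip()
    if value.isdigit():  # delay-seconds form, e.g. "120"
        return float(value)
    try:  # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def decide(status, retry_after=None, attempt=0):
    """Map a response status to (action, delay_seconds)."""
    if status == 410:
        return ("drop", 0.0)  # Gone: remove the URL from the crawl queue
    if status in (429, 503):
        server_delay = parse_retry_after(retry_after)
        if server_delay is not None:
            # The server told us how long to wait, so honor it in full.
            return ("retry", max(server_delay, BASE_DELAY))
        # No hint from the server: back off exponentially, up to MAX_DELAY.
        return ("retry", min(BASE_DELAY * 2 ** (attempt + 1), MAX_DELAY))
    if 200 <= status < 300:
        return ("ok", BASE_DELAY)
    return ("skip", BASE_DELAY)  # other errors: move on, do not hammer the URL


def fetch(url, max_attempts=4):
    """Fetch one URL politely. Network errors (URLError) propagate to the caller."""
    for attempt in range(max_attempts):
        req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                status, headers, body = resp.status, resp.headers, resp.read()
        except urllib.error.HTTPError as err:
            status, headers, body = err.code, err.headers, None
        action, delay = decide(status, headers.get("Retry-After"), attempt)
        if action != "retry":
            return action, body  # caller should still sleep BASE_DELAY between URLs
        time.sleep(delay)
    return "skip", None
```

The decision logic is kept separate from the network call so it can be checked without hitting anyone’s server. A caller would loop over its queue, call `fetch`, remove any URL that comes back as `"drop"`, and sleep `BASE_DELAY` between requests to the same host.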