> How long is necessary?
As long as is needed for the stated purpose. If you're doing IP-based rate limiting with a 1 hour window, the IP probably doesn't need to still be in your systems more than 12 hours from now. If you're doing longer-term IP reputation or something similar, keeping it around longer can probably be justified.
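To make the rate-limiting case concrete, here is a rough sketch of a limiter that forgets IPs once they fall outside the window. The class name, the limit of 100 hits, and the in-memory dict are all assumptions for illustration; a real setup would more likely lean on a store with built-in expiry.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # the 1 hour window from the example above
MAX_HITS = 100         # hypothetical per-IP limit, not from any real system


class RateLimiter:
    """Sliding-window limiter that keeps an IP only as long as the window needs it."""

    def __init__(self, window=WINDOW_SECONDS, max_hits=MAX_HITS):
        self.window = window
        self.max_hits = max_hits
        self.hits = defaultdict(deque)  # ip -> timestamps of recent hits

    def _expire(self, timestamps, now):
        # Drop timestamps that have aged out of the window.
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        timestamps = self.hits[ip]
        self._expire(timestamps, now)
        if len(timestamps) >= self.max_hits:
            return False
        timestamps.append(now)
        return True

    def purge(self, now=None):
        # Run periodically: forget every IP with no hits left in the window.
        now = time.time() if now is None else now
        for ip in list(self.hits):
            self._expire(self.hits[ip], now)
            if not self.hits[ip]:
                del self.hits[ip]
```

The point is the `purge` step: once the window has passed, the limiter has no further use for the IP, so nothing is kept "just in case".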
> What does limited mean?
The same: long enough to serve its purpose, and no longer (without a justifiable exception, such as being evidence of an actual crime).
> Does a regulator now get to determine what sort of algorithms I can use
Not really, any more than they already do.
"Not guilty, Your Honour; you see, we do store people's HIV status against their real names on the public blockchain, but don't worry, it's ROT-13 encrypted! Twice!"
Also, remember that it's not really the IP itself that you care about (from a privacy perspective). An IP plus a timestamp is a very precise selector if you have any other data at all.
Nobody knows that '192.168.1.1' is actually me. And even if they did, does it really matter?
But maybe they know that only $IP hit /orders/confirm within 5 minutes of some other system recording that $ME placed an order with other details.
From a privacy standpoint, it's your ability to cross-correlate that IP with whatever else you know about it that could allow you to identify and track or profile the actual person using it.
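A toy sketch of that kind of cross-correlation, to show how little it takes. Everything here is made up for illustration: the log entries, the customer name, and the `candidate_ips` helper; the IPs come from the reserved documentation ranges.

```python
from datetime import datetime, timedelta

# Hypothetical web access log: (ip, path, timestamp)
access_log = [
    ("198.51.100.4", "/products", datetime(2024, 1, 1, 12, 1)),
    ("198.51.100.4", "/orders/confirm", datetime(2024, 1, 1, 12, 3)),
    ("203.0.113.7", "/orders/confirm", datetime(2024, 1, 1, 15, 40)),
]

# Hypothetical record from a separate order system: (customer, order time)
orders = [("alice", datetime(2024, 1, 1, 12, 0))]


def candidate_ips(order_time, log, path="/orders/confirm",
                  window=timedelta(minutes=5)):
    """Return the IPs that hit `path` within `window` of the order time."""
    return {
        ip
        for ip, hit_path, ts in log
        if hit_path == path and abs(ts - order_time) <= window
    }


for customer, placed_at in orders:
    print(customer, candidate_ips(placed_at, access_log))
```

If only one IP comes back, the "anonymous" address in the log is now tied to a named customer, and every other log line with that IP can be attributed to them as well.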
Suppose your marketing dept asked you to scan the last few weeks of security logs to see if you'd had any hits from ranges belonging to $BIGCORP, who you're in tense negotiations with. Is that OK? Or would you refuse, because the security logs are collected exclusively for certain purposes and that isn't one of them?