IPDetective started as a hobby project for my other hobby projects :) and I decided to wrap a simple website around it and offer it as a service.
Let me know your thoughts, whether you find value in this service, or if you have any feature requests.
There are a few sites I could use this for, but some of them would end up sending private customer IPs for lookups. For that reason I just manually check the suspicious ones and skip the ones I know have passed the 'already a member and human, not hacker' gate.
I keep looking for something where I can just add lists to my server and check against them, but I'm not going to spend thousands on it.
Also, from a performance perspective, I can't do hundreds of millions of requests a month over the wire. That's just a waste.
Also IPDetective can be used just as a detection solution rather than a prevention solution.
Sometimes I use it against my nginx access logs and see how much of my traffic could be from bots.
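For anyone who wants to try the same kind of log analysis, here is a minimal sketch. It assumes nginx's default `combined` log format, where the client address is the first field on each line. The `is_flagged` callable is a hypothetical stand-in for whatever lookup you use (an API call, a local list); it is not part of IPDetective.

```python
from collections import Counter
import ipaddress


def client_ips(log_lines):
    """Yield the client IP of each line, assuming the first
    whitespace-separated field is $remote_addr (nginx 'combined' format)."""
    for line in log_lines:
        parts = line.split(maxsplit=1)
        if not parts:
            continue  # blank line
        try:
            yield str(ipaddress.ip_address(parts[0]))
        except ValueError:
            continue  # first field is not an IP address; skip the line


def flagged_share(log_lines, is_flagged):
    """Return the fraction of requests whose client IP is flagged.

    is_flagged is a hypothetical callable: ip string -> bool.
    Each distinct IP is looked up once, then weighted by its request count.
    """
    counts = Counter(client_ips(log_lines))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    flagged = sum(n for ip, n in counts.items() if is_flagged(ip))
    return flagged / total
```

Looking up each distinct IP once rather than once per request keeps the number of lookups down, which matters if the lookup goes over the network.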
> I keep looking for something where I can just add lists to my server and check against them, but I'm not going to spend thousands on it.
What do you mean by this? Like having the ability to create your own deny/black list of ip addresses and then you can validate against it?
> What do you mean by this? Like having the ability to create your own deny/black list of ip addresses and then you can validate against it?
They want to download your database and run queries on it locally, rather than call your API. This way they don't share data with you and don't need to worry about your data practices.
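To make the "run queries locally" idea concrete, here is a rough sketch of checking addresses against a downloaded list of IPv4 CIDR ranges without any network call. The class name and the input format (one CIDR string per entry) are assumptions for illustration; IPDetective does not necessarily offer such a download.

```python
import bisect
import ipaddress


class LocalIPv4List:
    """In-memory IPv4 range list with O(log n) membership checks."""

    def __init__(self, cidrs):
        # Merge overlapping/adjacent ranges so a binary search on the
        # range start is enough to find the only candidate range.
        nets = sorted(
            ipaddress.collapse_addresses(
                ipaddress.IPv4Network(c, strict=False) for c in cidrs
            )
        )
        self._starts = [int(n.network_address) for n in nets]
        self._ends = [int(n.broadcast_address) for n in nets]

    def __contains__(self, ip):
        # Raises ValueError (AddressValueError) for anything that is
        # not a valid IPv4 address string.
        value = int(ipaddress.IPv4Address(ip))
        i = bisect.bisect_right(self._starts, value) - 1
        return i >= 0 and value <= self._ends[i]
```

Usage would look like `"203.0.113.200" in LocalIPv4List(["203.0.113.0/24"])`. Nothing leaves the server, which addresses both the privacy concern and the request-volume concern raised above.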
Do you have any data showing your service is better/more accurate/better false positive rate than competing offerings?
>IPDetective collects data from about 60+ different sources such as official cloud provider endpoints and public VPN/Proxy/Tor/Bot net lists.
Are you on the up and up with all those public sources? I'm not sure which ones you're sourcing, but many do not allow commercial use, resale, or at the very least have some attribution clause to keep security companies from mooching off crowdsourced data.
>No, currently IPDetective does not support ipv6 addresses. However this feature is on the road map.
That's a major shortcoming in 2022.
Finally, as itake pointed out below, I'm not giving you my email just to run a simple test query and see what the results look like.
>Do you have any data showing your service is better/more accurate/better false positive rate than competing offerings
I have signed up for some of the competitors. My service averaged about 20ms per request for a single IP address; the competition was typically around 200ms to 300ms. As for a better false-positive rate, that would be rather hard to demonstrate.
> Are you on the up and up with all those public sources?
I gathered data only from sources that did not have any non-commercial-only licensing. Would you recommend I reach out to all of the sources regardless, even the ones that say you can use the data commercially? Or just the ones that do not say anything about commercial use at all?
> No, currently IPDetective does not support ipv6 addresses. However this feature is on the road map.
Yeah, I know. I'm just focusing on IPv4 right now. I am still collecting IPv6 addresses; however, I have not implemented them in the service yet.
Thanks a bunch for writing the comments above.
> That's a major shortcoming in 2022.
In the US, it’d be nice if national ISPs (fiber, cable, and mobile) thought this.
Without this you were also probably inadvertently contributing to the de-democratization of email. I used to run my own email off a cloud server, but that time has passed.
Currently the known address is scoped to the user/client who is using the service.
Do you have any plans to expand the business? Captchas as a service? A cheap version of CF’s proxy maybe?
2/ I've tried blocking vpn or proxy users (by ip hosting provider), but found too many false positives. Using a VPN is common for IT professionals or consumers trying to 'protect' their privacy and this impacted the growth of my app.
How is your tool better at detecting a bot vs a human using a vpn?
Okay I will look into a solution.
> 2/ I've tried blocking vpn or proxy users (by ip hosting provider), but found too many false positives. Using a VPN is common for IT professionals or consumers trying to 'protect' their privacy and this impacted the growth of my app.
That's good to know. I do not find that I have a lot of false positives, but I would imagine it all depends on the audience.
>How is your tool better at detecting a bot vs a human using a vpn?
It does not know the difference as of right now. I was thinking about adding user-agent validation as well, which could add another layer.
Presumably this service exists because bots try to avoid detection. I don't think UA validation would really help much and there are plenty of libraries already that do this.
I should have added context (or at least an anecdote) to help you during any product roadmap meetings... I used to oversee web ops for a global real estate company (one of the biggest in the U.S. and the world)... and our consumer-facing websites would show tons and tons and tons of listings of residential homes for sale. Of course, we were really just showing listings from our own data store as well as from other real estate companies who agreed to share listings data. As in many data sharing and data sync arenas, there are data issues. The most common scenario: "hey, you're showing a home that sold X time ago... stop showing it!" And in all cases it was "someone else's data". But to the customer, or even other realtors, we were the bad guys. Even realtors who should have known better that there are always data issues that we cannot fully control, at least not at the source, would complain to us. Now, one might assume that once the source data updates properly, the "correct data" should flow through, right? Well, not in real estate! Clearly this is a different arena. But I've learned that even if it's only to remove a local cache of data, it's a good idea to give users and/or customers a mechanism to at least properly communicate to you about stale data... and, if appropriate, maybe even allow self-service for a user/customer to get rid of the "bad data". Obviously this merits putting protocols in place to avoid abuse... but I hope you get the idea. Good luck!
VPN usage is increasingly common by consumers and in my country I've seen ads for it in places ranging from Mozilla in my browser to NordVPN on my TV. There's massive overshare of IPv4 addresses and it's really, really annoying to find out that you can't use a service because you're on a naughty list. I feel like yelling "I am a customer and want to buy something!" on occasion – the net result is I go elsewhere.
Stopping abuse and bot detection is one thing. Banning people for something they might have literally no control over is quite another.
I think it really depends on the consumer application. But, as some other folks mentioned, one option is to add the ability to have IP exceptions, essentially an allow list, which I am in favor of. This brings up an entirely different issue, which is identifying the user as legitimate.
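A minimal sketch of how an allow list could take precedence over a detection result. The function name, the `"allow"`/`"block"` return values, and the `is_flagged` callable are all hypothetical; this is not how IPDetective implements exceptions, and it does not solve the separate problem of deciding who gets onto the allow list.

```python
import ipaddress


def verdict(ip, allow_cidrs, is_flagged):
    """Return "allow" or "block" for an IP address string.

    allow_cidrs: iterable of CIDR strings that are always allowed.
    is_flagged:  hypothetical callable, ip string -> bool.
    """
    addr = ipaddress.ip_address(ip)
    for cidr in allow_cidrs:
        # An address of a different IP version is never "in" the network,
        # so mixed IPv4/IPv6 allow lists are safe here.
        if addr in ipaddress.ip_network(cidr, strict=False):
            return "allow"  # exception wins over any detection result
    return "block" if is_flagged(ip) else "allow"
```

Checking the allow list first also means known-good addresses never need a lookup at all.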
Disclaimer: I work for a company that operates in a similar space so don't go into too much secret sauce if you don't want to :)
The main thing I look at is making sure my sources are not older than 1 year. Sometimes these sources break and I update them, or they go offline altogether. I am always looking for new hosting providers, VPNs, and proxies.
I can also do further port analytics on these IPs. A lot of VPN services have specific ports open that are used for VPN traffic. I would say that automating every source from the start was a good choice, so I can stay up to date.
I have a feeling that there would be overwhelming overlap.
That might not be bad, just low hanging fruit everyone can get.
The speed of acquiring high-probability threats would, I guess, be a better / more valuable comparison.
Perhaps running two or more of these offerings side by side for a couple of months.
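A side-by-side comparison could start with something as simple as measuring the overlap between the sets of IPs two offerings flag over the same period. This is only a sketch: the function name is made up, and it assumes you have already collected each offering's flagged IPs as plain strings.

```python
def compare_feeds(flagged_a, flagged_b):
    """Summarize the overlap between two collections of flagged IP strings."""
    a, b = set(flagged_a), set(flagged_b)
    union = a | b
    both = a & b
    return {
        "only_a": len(a - b),
        "only_b": len(b - a),
        "both": len(both),
        # Jaccard similarity: |A ∩ B| / |A ∪ B|; defined as 0.0 here
        # when both feeds are empty.
        "jaccard": len(both) / len(union) if union else 0.0,
    }
```

A high Jaccard score would support the "overwhelming overlap" guess; the `only_a` and `only_b` counts show where the offerings actually differ. Measuring which one flags a given IP first would need timestamps as well, which this sketch does not cover.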
I would prefer a feed of ALL, with frequent updates, instead of querying a 3rd party every time. (Well, obviously you can at least cache responses locally for a period of time.)
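The local caching mentioned here can be sketched as a small time-to-live cache in front of whatever lookup function you use. The `lookup` callable and the class name are assumptions for illustration, not part of any real client library.

```python
import time


class TTLCache:
    """Cache lookup results per IP for ttl_seconds.

    Note: entries are never evicted, only refreshed on access after
    expiry, so a long-running process would need size limits on top.
    """

    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup        # hypothetical callable: ip -> result
        self._ttl = ttl_seconds
        self._clock = clock          # injectable for testing
        self._entries = {}           # ip -> (expires_at, result)

    def get(self, ip):
        now = self._clock()
        hit = self._entries.get(ip)
        if hit is not None and hit[0] > now:
            return hit[1]            # still fresh, no remote call
        result = self._lookup(ip)
        self._entries[ip] = (now + self._ttl, result)
        return result
```

The trade-off is staleness: a longer TTL means fewer remote calls but a longer window in which a changed classification goes unnoticed.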