I tried to sign up for SiriusXM the other day, and though I could create an account with my .pro email address, I couldn't actually sign up for service with that same address for some reason. It's frustrating that validating email addresses is still something that people get so wrong. Please just take whatever seeming garbage I've entered into your email address field and try to send a message to it.
(Their site also had stupid password generation rules such that I couldn't use the 21-character one my password manager auto-generated, but even after I made one that followed the rules on the page, it was still rejected because there were apparently rules on the back end that weren't spelled out in the front end. Please hire me, SiriusXM.)
For example, I've received multiple warnings from Intercom that I need to improve deliverability of my email, or they will ban my account. Ironically one of the suggestions is to use confirmation emails - but that's exactly where the problem is for me.
A tool like this helps me to weed out a ton of these undeliverable email addresses to avoid sending emails that will hit my spam score.
> checking for a bounce
So in my case, generating that bounce is exactly what I need to avoid in order to make sure my account remains in good standing.
Not judging or anything though I know my tone might seem that way.
We were onboarding a large new client to our SaaS product. This process involved creating accounts for all of their employees (tens of thousands) and sending emails with an activation link where they'd be able to set up their password.
Our system sends these emails in batches, and as soon as the first batch went out we got an alert from our monitoring system that our bounce rate was surging - high enough to risk a sending pause from Amazon SES. We stopped sending and investigated the issue, and it turned out that the email list we were given was a mess - it included all current employees, but also a huge number of former ones. Just under 1/10th of the emails in our first batch were invalid.
We asked the client to give us a better list, but due to internal issues they couldn't get that to us any time soon. Meanwhile they were breathing down our necks to get these emails out ASAP, and they were a large enough client that our management wanted to keep them happy, so we tried out one of these email validation services. Unfortunately, it didn't work. It turns out that this technique doesn't work for all mail servers. It was reporting every email as valid, even ones we knew were invalid since they'd already hard bounced.
(Edit: thinking back - this was several years ago - I think it wasn't saying that they were valid emails, just that it couldn't tell whether they were valid or not - the service was able to detect that the server wasn't rejecting non-existent addresses.)
We ended up unpausing the emails and just hoping for the best. Ended up with something like an 8% bounce rate that eventually fell off our record as our normal sending patterns resumed. Amazon's guidelines say they might cut you off when you hit 10%, so we cut it pretty close.
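For anyone wanting to automate the "stop sending before the provider does" part of that story, here is a minimal sketch. The 10% limit is the figure quoted above; the safety margin and both function names are assumptions made up for illustration, not anything Amazon SES documents or provides.

```python
def bounce_rate(bounced: int, sent: int) -> float:
    """Fraction of sent messages that hard bounced (0.0 if nothing was sent)."""
    return bounced / sent if sent else 0.0


def should_pause(bounced: int, sent: int,
                 limit: float = 0.10, margin: float = 0.02) -> bool:
    """Return True once the running bounce rate gets within `margin` of `limit`.

    `limit` is the 10% figure mentioned in the comment above;
    `margin` is an arbitrary safety buffer chosen for this sketch.
    """
    return bounce_rate(bounced, sent) >= limit - margin
```

A batch sender could call `should_pause` after each batch and hold the remaining batches until someone has looked at the list.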
It's terrible to spend a lot of effort on this kind of tech just because some business partner has shitty customer support.
Keep your "overall bounce rate" low, by ALSO sending out extra emails to confirmed email addresses. Like, for every "confirmation" email, also send a "thanks for joining us" email to someone that already confirmed their email.
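To put rough numbers on that dilution idea, a small sketch follows. The function name and the sample figures are hypothetical, and it assumes the extra emails to confirmed addresses never bounce.

```python
import math
from fractions import Fraction


def extra_sends_needed(sent: int, bounced: int, target: Fraction) -> int:
    """Smallest number of additional non-bouncing sends that brings
    bounced / total down to `target` or below."""
    if bounced == 0:
        return 0
    # Total sends required so that bounced / total <= target.
    needed_total = math.ceil(Fraction(bounced) / target)
    return max(0, needed_total - sent)


# Hypothetical example: 80 bounces out of 1000 sends (8%), aiming for 5%.
print(extra_sends_needed(1000, 80, Fraction(5, 100)))  # 600
```

So under these assumptions an 8% bounce rate on 1000 emails needs 600 extra deliverable emails to get down to 5%, which shows how quickly this approach gets expensive.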
In the past when I worked on a system that needed to notify via email we always had a way to change delivery process for certain emails, domains, etc for exactly this reason. This is one of those cases where we would “deliver directly” (i.e. send directly to their mail provider).
If the address is to a large host, then they will use reaching invalid email addresses as evidence that you are not keeping to best practices. They will throttle deliverability, and possibly reject email.
If you're sending to an invalid host, then your mail sending provider (if you're using one) may consider you a bad customer and send you through a lower grade of outbound IP addresses.
Frequently new registrations are processed at once as batch imports from another system or from a partner. These invalid email addresses need to be removed before sending, so that they don't hurt sending reputation.
If you work at a company that would abuse hibp and a direct mention in CANSPAM you should refuse the work.
A recent example was the CEO of Evernote for the work put in to their behind the scenes series although I don't expect anyone to read it of course. People are busy!
I wrote a bit about it here: https://utf9k.net/blog/email-lookup/
Now does this scale? Not at all and I haven't read the email spec or anything like that. It's also handy in a pinch if you wrote down an email but can't remember if it's spelled correctly or not.
nslookup -query=mx evernote.com
Edit: Trying out the macOS/Linux invocation on Windows also works: nslookup -q=mx evernote.com
Oddly, the first time I did this, I only got IPv4 results; subsequent queries for the same domain included IPv6 as well.
The practice of email confirmation is still widely used, but the change in email deliverability rules has made it a pain to properly validate addresses.
Even if you are using a 3rd party provider like SES or mailgun, they have an email bounce limit. A considerable number of real world users give fake email addresses (which is even sometimes encouraged on HN), which triggers those bounce limits.
To fix it, there are paid services, but they do not work very well. Fixing it yourself takes a lot of engineering time that is better spent elsewhere.
Providing an open solution to this problem (which is given in the github repo) is a double-edged sword, as this gives an edge to the spammers who created the problem in the first place.
// IsValidEmail takes a string and returns whether it’s a valid email address
The AI response was a nightmarish 100+ character regex that made my blood curdle.
I think of email validation like encryption: don’t roll your own, and don’t trust an AI to do it either.
Edit: Here's the regex: https://gist.github.com/cassidoo/6101ef0657665683b787aab5ae9...
This is a regex that validates a string against the RFC822 "Standard for ARPA Internet Text Messages" and it contains 6424 characters.
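For contrast with the giant regexes, here is a deliberately permissive sketch in the spirit of "take whatever was entered and try to send to it". The function name is made up, and the rules (one separator, non-empty parts, no whitespace) are an assumption about what "minimal" means, not a reading of any RFC.

```python
def looks_like_email(address: str) -> bool:
    """Very loose sanity check: something, an "@", then something.

    Splits on the LAST "@" so unusual local parts are not rejected.
    Real validation is left to actually sending a message.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    # Whitespace anywhere is almost certainly a paste or typing error.
    return not any(ch.isspace() for ch in address)
```

Anything that passes gets a confirmation email; the mailbox either answers or it doesn't.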
The fewer times you bounce, the better your chances of not being marked as a spammer.
This could also be useful for an ecommerce site, where you want to be able to easily contact the buyer if there's a delay, address correction needed, etc. People typo their own email at a rate that's surprising.
Forget about AI and self-driving cars, we can't even get email validation right in 2021.
Now, if you are buying lists here and there to spam the hell out of them, the bounce rate would flag you very quickly and you'd need to find another SMTP provider every week. This service would be your lifeline.
Do that too often and servers can start blacklisting your domain/IP because it looks like you're "scanning" for available email addresses.
For some of these legacy accounts, the registered email addresses may have typos.
So, even a basic DNS check against the existence of the domain’s MX record is helpful.
Any of the ‘suspect’ email addresses can then be further evaluated by a human, and then removed or fixed.
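The triage described above could look roughly like this. It is only a sketch: `triage_by_mx` is a made-up name, and `has_mx` is a caller-supplied function (it might wrap an `nslookup -q=mx` call or a DNS library, which is an assumption left outside this snippet) so the logic itself needs no network access.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def triage_by_mx(addresses: Iterable[str],
                 has_mx: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Split addresses into (ok, suspect) based on whether the domain has MX records.

    `has_mx(domain)` is hypothetical and supplied by the caller.
    Each domain is looked up only once.
    """
    ok: List[str] = []
    suspect: List[str] = []
    cache: Dict[str, bool] = {}
    for address in addresses:
        _, sep, domain = address.rpartition("@")
        domain = domain.lower()
        if not sep or not domain:
            suspect.append(address)  # no domain part at all
            continue
        if domain not in cache:
            cache[domain] = has_mx(domain)
        (ok if cache[domain] else suspect).append(address)
    return ok, suspect
```

The `suspect` list is what would go to a human for review, for example to spot a typo like `gmial.com`.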
As a result the only place such a service is useful is for someone who has a ton of low-value emails they don’t trust, and they don’t want a ton of bounces when they hit send (which risks losing your send privileges with pretty much any high-volume email platform).
So they run all their emails through a service like this, and only send to the ones marked valid. This excludes a ton of emails that actually are valid, but failed the check (false negatives). But that’s ok because the emails were low-value to begin with.
If this sounds like a spammy operation… bingo. Technical email validation services are really only useful for people who are doing things like buying email lists from commercial providers, harvesting emails from sites like HN, or forcing people to enter an email address to do basic things with a free service.
Given their questionable business practices, their customer service dark patterns, their dated and awful UX, and their inevitable demise to much more popular streaming services, you’d be best to stay far away.
Honestly not where I thought that rant was heading ;)
I have no email address that this counts as anything other than "risky".
If this opts me out of marketing mail then that's probably a good thing, but I hope nobody puts a password-reset or security/billing notifications behind it.
I ask because it is something I have always thought about, but I suppose I kept hoping a service would come along and magic the solution for me. Kudos on making it happen!
So far it has been really great. Easy, effective.
Edit: Like the other reply you got, I use FastMail for this service.
This is easy to do with the Alias feature of FastMail.
I've used it briefly for testing purposes and I have no complaints about it, it delivered what I expected with no hiccups.
Couldn’t the seller just remove the prefix from all emails before selling them?
EDIT: Looks like they are indeed doing the SMTP method: https://github.com/reacherhq/check-if-email-exists/blob/a052...
I also found a similar, much bigger service here that appears to have been around for a while: https://emailverification.whoisxmlapi.com/api
There are products that definitely make it past the seed round, and sometimes even become public companies, before enforcement notices that their entire product runs afoul of the law.
Eep. My email is listed half a dozen times in Have I Been Pwned records, but I use different passwords for every site, so this means nothing.
def has_user_been_pwned(email):
    return True
There. It's nearly impossible to be on the Internet at all without having some account or another be involved in an exploit at some point. You could rename the endpoint `user_had_a_facebook_or_twitter_or_linked_account_or_has_a_credit_score()`. This is a worthless thing to query because it tells you absolutely nothing about the owner of the address.
What a way to ruin a wonderful thing! Abusing the Have I Been Pwned service in this way has worried me enough that I’ve now gone and removed my data from the publicly searchable database. I’ll use the notification service instead.
Don’t blame the tool.
Is an unknown classification supposed to be treated as "I don't know, probably safe" or "I don't know, probably don't accept it" ?
{
"input": "***redacted***",
"is_reachable": "unknown",
"misc": {
"is_disposable": false,
"is_role_account": false
},
"mx": {
"accepts_mail": true,
"records": [
"in2-smtp.messagingengine.com.",
"in1-smtp.messagingengine.com."
]
},
"smtp": {
"error": {
"type": "TimeoutError",
"message": "future has timed out"
}
},
"syntax": {
"address": "***redacted***",
"domain": "***redacted***",
"is_valid_syntax": true,
"username": "***redacted***"
}
}
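One possible answer to the "unknown" question, as a sketch only: treat anything that is not a clear rejection as "go ahead and send a confirmation email". The field name `is_reachable` and the values "unknown" and "risky" appear in this thread; the values "safe" and "invalid", the function name, and the returned labels are assumptions.

```python
def decide(result: dict) -> str:
    """Map a verification result (shaped like the JSON above) to an action.

    Assumed values of "is_reachable": "safe", "risky", "invalid", "unknown".
    Only a definite "invalid" blocks the address; everything uncertain
    falls back to an ordinary confirmation email.
    """
    status = result.get("is_reachable")
    if status == "invalid":
        return "reject"
    if status == "safe":
        return "accept"
    # "risky", "unknown", or a missing field: don't guess, just confirm by email.
    return "send_confirmation"
```

With this policy a timeout like the one above never locks a real user out, which matters for things like password resets or billing notices.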
I'm going to guess that fastmail has blocked you lol
{
"input": "***redacted***",
"is_reachable": "unknown",
"misc": {
"is_disposable": false,
"is_role_account": false
},
"mx": {
"accepts_mail": true,
"records": [
"mx.zoho.com.",
"mx2.zoho.com."
]
},
"smtp": {
"error": {
"type": "SmtpError",
"message": "permanent: 5.7.1 Email cannot be delivered. Reason: Email detected as Spam by spam filters. "
}
},
"syntax": {
"address": "***redacted***",
"domain": "***redacted***",
"is_valid_syntax": true,
"username": "***redacted***"
}
}
> 451 4.7.1 <marcus@utf9k.net>: Recipient address rejected: Temporary deferral, try again soon
"smtp": {
"error": {
"type": "SmtpError",
"message": "permanent: The host name specified in HELO does not match IP address."
}
their server is misconfigured
This is a particularly interesting use of AGPL because it appears to contain a RESTful HTTP server built-in.
To my knowledge, with the way AGPL works, there are some interesting wrinkles:
- It is only intended to “trigger” when distribution occurs under some legal copyright law definition of “distribution.”
- It allows commercial use of unmodified and modified instances of the licensed code, as long as you provide the modified code, just like GPL.
- Being a copyright license and not a contract (at least not intentionally,) it only is “viral” to derived works and not aggregate works. So, depending on what you define a “derived work” as, some interactions between AGPL and non-AGPL code may be kosher.
Where this gets interesting to me is wherever you draw the line for derived works. For example, Ghostscript’s developers have a page regarding this subject:
https://www.ghostscript.com/doc/current/Commprod.htm
Particularly:
> The application calls GPL Ghostscript in a way that allows an ordinary user to substitute another program for GPL Ghostscript. (Typically this requires use of a shell script or batch file, or a system call like "exec".) More precisely, if the user deletes from the computer system all the files in the GPL Ghostscript directories, and replaces the GPL Ghostscript executable with another program with the same name and conforming to the same documentation, the application will continue to work with it. One implication of this is that the GPL Ghostscript documentation must specify all properties of GPL Ghostscript on which the application relies; for example, if GPL Ghostscript has been modified by the addition of command line switches or language elements such as new operators, the documentation must describe any such additions that the application uses.
If your AGPL application exposes a trivial JSON API, could you not write another application that simply supports a compatible interface, have it take an endpoint URL at runtime, and then just setup the software on another server and point to it? Although your software could be non-AGPL and maybe even closed-source, it could in theory be swapped out for any compatible service, including a simple noop implementation.
Assuming the author(s) retain the copyright for all contributions so far, they are obviously able to use the program without worrying about said licensing restrictions. But if you flip it around and someone else also runs a SaaS where they distribute the source as per AGPL restrictions, they could then offer said services and presumably it would not be possible for AGPL to have “virality” to spread outwards further. I can’t think of any reason this scenario wouldn’t work the same if done on internal networks by a single entity.
There is perhaps no particular takeaway here. In fact, maybe this was even intended to be a potential use case. However, I worry that this loophole may not be being considered:
> If you want to use check-if-email-exists to develop commercial sites, tools, and applications, the Commercial License is the appropriate license. With this option, your source code is kept proprietary. Purchase an check-if-email-exists Commercial License at https://reacher.email/pricing.
While it is obviously true that directly integrating the library into an application would indeed constitute a derived work, I am skeptical, based on what I know, that using a fairly generic REST API would necessarily constitute this.
I’m always a fan of open source as a model for better software development, but I do think that one really needs to be careful that it’s actually what they want. AGPL is a very interesting beast and there seem to be a lot of subtleties with regard to its implications in edge cases.
Notwithstanding any other provision of this License, if you modify the Program, your modified version must...
https://www.gnu.org/licenses/agpl-3.0.html
That said I would be interested if anyone could clarify to what extent a copyright license has any legal power if you're not distributing anything.
Ignoring whether the keygen/crack itself is illegal, redistributing it with a trial version could be. Aside from copyright licenses, there are sometimes clickwrap licenses that disallow you from redistributing the trial at all. On the other hand, I believe it is unclear if a copyright license itself (as opposed to a clickwrap agreement) can actually disallow distribution based on other things it is aggregated with. This isn’t a terribly big issue for AGPL and GPL because they explicitly limit their terms to not apply:
> A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an “aggregate” if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.
Simple: no license, no copy.
It doesn’t matter how many copies you make, you still need a license. The “defence” of “I didn’t agree to the license” is basically admitting knowing and wilful infringement.
Is the address provided by a known disposable email address provider?
Is the email address bound to a known free email provider?
Does email address under test hide a honeypot?
What are the legit use-cases for this? To be sure you can force spam on your users and identify them as ad-targets?
In our SaaS we enforce slightly stronger limits for trial accounts that sign up from free/disposable emails. User is Gmail? Well, sorry, +XX to "spam score". They will probably use our system for spamming.
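The free/disposable distinction mentioned here usually comes down to a domain lookup. A minimal sketch, where the two sets are tiny hypothetical samples (real lists are much longer and need maintenance) and `provider_class` is a made-up name:

```python
# Hypothetical sample lists; both domains are only examples taken from this thread.
DISPOSABLE_DOMAINS = {"mytrashmail.com"}
FREE_DOMAINS = {"gmail.com"}


def provider_class(address: str) -> str:
    """Classify an address as "disposable", "free", or "other" by its domain."""
    domain = address.rpartition("@")[2].strip().lower()
    if domain in DISPOSABLE_DOMAINS:
        return "disposable"
    if domain in FREE_DOMAINS:
        return "free"
    return "other"
```

How much weight each class gets in a "spam score" is a business decision and is deliberately left out of the sketch.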
In Germany an Imprint for a paid service is mandatory, and according to the ToS it seems like this is a product from a company in France? (I believe there are similar laws in France?)
> The Telemediengesetz (German meaning "Telemedia Act") requires that German websites disclose information about the publisher, including their name and address, telephone number or e-mail address, trade registry number, VAT number, and other information depending on the type of company.
The relevant German laws (§5 TMG and §55 RStV) are fairly vague, so lawyers recommend publishing an Impressum if your website contains any commercial content (for example, ad banners) or any journalistic content (for example, blog posts).
(note I'm just using mcD & walmart as place-holder company names, have no idea if they operate their public wi-fi that way)
Also, there's no DB. Each verification is done in real-time.
All MTAs that I’m aware of support this, and it’s moderately common on personal domains with a single self-hosted MX, rare as those are these days.
For those wondering, this is actually specified in RFC 5321 section 5:
If an empty list of MXs is returned,
the address is treated as if it was associated with an implicit MX
RR, with a preference of 0, pointing to that host.
https://datatracker.ietf.org/doc/html/rfc5321#section-5
[0]: https://help.reacher.email/reacher-licenses#31b18f7872fc4480...
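The implicit-MX rule quoted from RFC 5321 section 5 above can be sketched as a small pure function. The function name and the `(preference, host)` tuple shape are assumptions for illustration; the DNS lookup that produces `mx_records` is left out.

```python
from typing import List, Tuple


def delivery_hosts(domain: str,
                   mx_records: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Return (preference, host) pairs to try, lowest preference first.

    If the MX lookup returned an empty list, fall back to an implicit MX
    with preference 0 pointing at the domain itself, as in the RFC text above.
    """
    if not mx_records:
        return [(0, domain)]
    return sorted(mx_records)
```

This is why "no MX record" on its own does not prove an address is undeliverable.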
Also, just because I have a catch all on my domain really shouldn’t be justification for saying my email isn’t valid and is a good way to lose business.
Agreed, 100% of the e-mails I use to sign up for services go to a catch-all.
It's not about how useful it is to others. It's how it makes the web worse for users when their perfectly valid email address gets rejected because the flawed library said it was fake.
Then send a follow up confirmation email, no captcha required.
It really depends on how valuable your service is and how easy you can make it for them to send you an email. For example, auto-filling the subject and body using the mailto query parameters, so that they just need to click the send button in their email app, helps a lot.
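As an illustration of the mailto pre-filling described here, a minimal sketch using only the standard library. The function name, address, and texts are placeholders, not anything from the commenter's actual setup.

```python
from urllib.parse import quote


def signup_mailto(to: str, subject: str, body: str) -> str:
    """Build a mailto: link with the subject and body pre-filled.

    Spaces and other special characters are percent-encoded so that
    mail clients parse the query parameters correctly.
    """
    return f"mailto:{quote(to, safe='@')}?subject={quote(subject)}&body={quote(body)}"


# Placeholder values for illustration only.
print(signup_mailto("signup@example.com", "Sign me up", "Please create my account."))
# mailto:signup@example.com?subject=Sign%20me%20up&body=Please%20create%20my%20account.
```

The resulting string goes into the `href` of the signup link, so the user only has to press send.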
It does reduce spam on contact forms, that’s for sure, but I don’t yet have enough data to say whether it hampers signups in a bad way.
My clients are pretty happy with this method tho, as it works fine for their domain and who they target on their landing pages.
Personally I kinda like this method of signup. At the same time I find it annoying when I’m not signed into the email I want to use on the particular device I’m using; while I could still send the email from another device, it does remove the ease of having it auto-filled for me.
But everything has pros and cons, and I kinda like the new upside-down way, at least when I build them for the services I work on.
It also has a few issues. I’ve practically never faced this in real life yet, but I can imagine someone trying it:
faking the MAILFROM header. I’ve personally got a few emails where they faked the mailfrom and mailto headers. Unless you verify them with DKIM they can still spam you, but it’s very rare and I haven’t seen any abuse of this method as of yet. However, if more people start using it, I can see it happening.
Now I'm slightly worried that in addition to counting me as a robot and an attacker, online services will think that my email address does not exist.
How to check if an email address exists without sending an email? - https://news.ycombinator.com/item?id=436817 - Jan 2009 (6 comments)
Indeed they can, my mail server seems to block the service because it already appears on multiple spam lists.
1. https://github.com/jeronimofagundes/EmailValidator#available... (PHP)
Most real humans have at least one address involved in a data breach, but most don't have access to hundreds of emails in a data breach. That means most people can only make use of the "one free ice cream per customer" deal once.
Pretty neat!
https://github.com/reacherhq/check-if-email-exists/issues/91
Identifying something that needs doing, especially when you have no idea how to do it, is a bold skill.
Too often have I seen engineers be reluctant to open an issue because they don’t know how to implement it, technically. I still do it, myself. If you need it then you’ll find a way.
The converse is also true: losing focus by filing tasks and procrastinating on features because you know how to build them, not because you actually need them to move your business / project forward.
tl;dr, According to RFC 5321, `RCPT TO` command succeeds with 250 and 251. So email is valid if you get to this part of the protocol and receive the response.
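A sketch of how the reply to `RCPT TO` might be interpreted. Only the 250/251 success codes come from the comment above; the handling of 4xx and 5xx replies, the function name, and the labels are assumptions. Python's `smtplib.SMTP.rcpt()` returns a `(code, message)` tuple that could feed this function.

```python
def classify_rcpt_reply(code: int) -> str:
    """Interpret the numeric reply to an SMTP `RCPT TO` command.

    250 and 251 mean the server accepted the recipient.
    4xx replies are temporary (e.g. the 451 deferral quoted earlier in
    this thread), so nothing can be concluded from them.
    5xx replies are permanent rejections, but not necessarily "no such
    mailbox": the 5.7.1 spam-filter reply seen earlier is also a 5xx.
    """
    if code in (250, 251):
        return "accepted"
    if 400 <= code < 500:
        return "unknown"
    if 500 <= code < 600:
        return "rejected"
    return "unknown"
```

Note that "accepted" still says little on catch-all domains, where every recipient gets a 250.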
i hope nobody ever uses this project, lest it break the usefulness of mytrashmail.com