I just read an article which states:
Internet domain addresses opened up to wave of new suffixes
Internet naming board approves huge expansion of approved domain extensions with .hotel, .bank, or .sport auctions likely.
Twenty-six years after .com was first unveiled to the world, officials have swept away tight regulations governing website naming, opening up a whole world of personalised web address suffixes.
But... I just learned how to validate email addresses by checking (among other things) the number of characters used after the dot (i.e., .com, .fr, etc.). What now?
Analysts say they expect 500 to 1,000 domain suffixes, mostly for companies and products looking to stamp their mark on web addresses, but also for cities and generic names such as .bank or .hotel.
Maybe this is not a problem. But how are we going to validate email addresses? What’s the plan?
IMO, the answer is to screw email validation beyond <anything>@<anything>, and deal with failed delivery attempts and errors in the email address (both of which are going to happen anyway).
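For what it's worth, a check that loose fits in a few lines. This is only a minimal sketch in Python, and the function name looks_like_email is made up for illustration. It requires something non-empty on both sides of the last @ and nothing more:

```python
def looks_like_email(address):
    """Return True if there is non-empty text on both sides of the last '@'."""
    # rpartition splits on the LAST '@'; if no '@' is present, the
    # separator slot comes back as an empty string.
    local, at, domain = address.rpartition("@")
    return bool(at and local and domain)
```

Anything that passes still has to be confirmed by actually sending mail to it.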
Related:
- How far should one take e-mail address validation?
As I've answered elsewhere, this regex is pretty good at handling localization and the new TLDs:
(?!^[.+&'_-]*@.*$)(^[_\w\d+&'-]+(\.[_\w\d+&'-]*)*@[\w\d-]+(\.[\w\d-]+)*\.(([\d]{1,3})|([\w]{2,}))$)
It does validate Jean+François@anydomain.museum and 试@例子.测试.مثال.آزمایشی, but it does not validate weird abuse of those nonalphanumeric characters, for example '.+@you.com'.
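If you want to try the pattern yourself, here is a rough sketch using Python's re module. That choice is an assumption: the answer does not say which regex engine it was written for, and the helper name matches is made up. In Python 3, \w in a str pattern matches Unicode word characters, which is what lets accented and non-Latin letters through. Other engines may treat \w as ASCII-only.

```python
import re

# The pattern from the answer, unchanged, split over two raw strings
# only for readability (adjacent literals are concatenated).
EMAIL_RE = re.compile(
    r"(?!^[.+&'_-]*@.*$)"
    r"(^[_\w\d+&'-]+(\.[_\w\d+&'-]*)*@[\w\d-]+(\.[\w\d-]+)*\.(([\d]{1,3})|([\w]{2,}))$)"
)

def matches(address):
    """Return True if the whole address matches the pattern."""
    return EMAIL_RE.match(address) is not None

# Print the verdict for one accepted and one rejected example.
for candidate in ["Jean+François@anydomain.museum", ".+@you.com"]:
    print(candidate, matches(candidate))
```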
Validating email addresses beyond a check for basic, rough syntax is pointless. No matter how good a job you do, you cannot know that an address is valid without sending mail to it and getting an expected reply. The syntax for email addresses is complex and hard to check properly, and turning away a valid email address because your validator is inadequate is a terrible user experience mistake.
See What is the best regular expression for validating email addresses?.
Even with the current TLDs, it is already practically impossible to verify an email address using a regex (and that's not the fault of the TLDs). So don't worry about the new TLDs.
The way I see it, the number of TLDs, while much larger than today's, will still be finite and deterministic, so a regex that checks against a complete list of possible domain suffixes (whether that list is your own or, hopefully, provided by a reliable third party such as ICANN) would do the trick.
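A list lookup does not even need a regex. Below is a rough sketch in Python of the same idea, using a plain set. SAMPLE_TLDS and has_known_tld are hypothetical names, and the five entries are only placeholders: a real deployment would load the full list from whatever maintained source you trust and refresh it regularly. Internationalized TLDs would also need their own handling, which this sketch ignores.

```python
# Hypothetical sample only; not a complete or authoritative TLD list.
SAMPLE_TLDS = {"com", "fr", "museum", "bank", "hotel"}

def has_known_tld(address, known_tlds=SAMPLE_TLDS):
    """Return True if the text after the last dot of the domain is listed."""
    _, at, domain = address.rpartition("@")
    # Require an '@' and at least one dot in the domain part.
    if not at or "." not in domain:
        return False
    # TLDs are case-insensitive, so compare in lower case.
    tld = domain.rsplit(".", 1)[1].lower()
    return tld in known_tlds
```

The obvious cost is that an outdated list starts rejecting valid addresses as soon as a new suffix is approved.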