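RFC-compliant email address validation in PHP
It can even validate a domain part written as an IPv6 address literal, which makes it impressively thorough.
Code: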
<?php
/*
Copyright 2009 Dominic Sayers
(dominic_sayers@hotmail.com)
(http://www.dominicsayers.com)
This source file is subject to the Common Public Attribution License Version 1.0 (CPAL) license.
The license terms are available through the world-wide-web at http://www.opensource.org/licenses/cpal_1.0
*/
function is_email ($email, $checkDNS = false) {
    // Check that $email is a valid address
    // (http://tools.ietf.org/html/rfc3696)
    // (http://tools.ietf.org/html/rfc5322#section-3.4.1)
    // (http://tools.ietf.org/html/rfc5321#section-4.1.3)
    // (http://tools.ietf.org/html/rfc4291#section-2.2)
    // (http://tools.ietf.org/html/rfc1123#section-2.1)
    // Contemporary email addresses consist of a "local part" separated from
    // a "domain part" (a fully-qualified domain name) by an at-sign ("@").
    // (http://tools.ietf.org/html/rfc3696#section-3)
    $index = strrpos($email,'@');
    if ($index === false) return false; // No at-sign
    if ($index === 0) return false; // No local part
    if ($index > 64) return false; // Local part too long
    $localPart = substr($email, 0, $index);
    $domain = substr($email, $index + 1);
    $domainLength = strlen($domain);
    if ($domainLength === 0) return false; // No domain part
    if ($domainLength > 255) return false; // Domain part too long
    // Let's check the local part for RFC compliance...
    //
    // Period (".") may...appear, but may not be used to start or end the
    // local part, nor may two or more consecutive periods appear.
    // (http://tools.ietf.org/html/rfc3696#section-3)
    if (preg_match('/^\\.|\\.\\.|\\.$/', $localPart) > 0) return false; // Dots in wrong place
    // Any ASCII graphic (printing) character other than the
    // at-sign ("@"), backslash, double quote, comma, or square brackets may
    // appear without quoting. If any of that list of excluded characters
    // are to appear, they must be quoted
    // (http://tools.ietf.org/html/rfc3696#section-3)
    if (preg_match('/^"(?:.)*"$/', $localPart) > 0) {
        // Local part is a quoted string
        if (preg_match('/(?:.)+[^\\\\]"(?:.)+/', $localPart) > 0) return false; // Unescaped quote character inside quoted string
    } else {
        if (preg_match('/[ @\\[\\]\\\\",]/', $localPart) > 0) {
            // Check all excluded characters are escaped
            $stripped = preg_replace('/\\\\[ @\\[\\]\\\\",]/', '', $localPart);
            if (preg_match('/[ @\\[\\]\\\\",]/', $stripped) > 0) return false; // Unquoted excluded characters
        }
    }
    // Now let's check the domain part...
    // The domain name can also be replaced by an IP address in square brackets
    // (http://tools.ietf.org/html/rfc3696#section-3)
    // (http://tools.ietf.org/html/rfc5321#section-4.1.3)
    // (http://tools.ietf.org/html/rfc4291#section-2.2)
    if (preg_match('/^\\[(.)+]$/', $domain) === 1) {
        // It's an address-literal
        $addressLiteral = substr($domain, 1, $domainLength - 2);
        $matchesIP = array();
        // Extract IPv4 part from the end of the address-literal (if there is one)
        if (preg_match('/\\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/', $addressLiteral, $matchesIP) > 0) {
            $index = strrpos($addressLiteral, $matchesIP[0]);
            if ($index === 0) {
                // Nothing there except a valid IPv4 address, so...
                return true;
            } else {
                // Assume it's an attempt at a mixed address (IPv6 + IPv4)
                if ($addressLiteral[$index - 1] !== ':') return false; // Character preceding IPv4 address must be ':'
                if (substr($addressLiteral, 0, 5) !== 'IPv6:') return false; // RFC5321 section 4.1.3
                $IPv6 = substr($addressLiteral, 5, ($index === 7) ? 2 : $index - 6);
                $groupMax = 6;
            }
        } else {
            // It must be an attempt at pure IPv6
            if (substr($addressLiteral, 0, 5) !== 'IPv6:') return false; // RFC5321 section 4.1.3
            $IPv6 = substr($addressLiteral, 5);
            $groupMax = 8;
        }
        $groupCount = preg_match_all('/^[0-9a-fA-F]{0,4}|\\:[0-9a-fA-F]{0,4}|(.)/', $IPv6, $matchesIP);
        $index = strpos($IPv6,'::');
        if ($index === false) {
            // We need exactly the right number of groups
            if ($groupCount !== $groupMax) return false; // RFC5321 section 4.1.3
        } else {
            if ($index !== strrpos($IPv6,'::')) return false; // More than one '::'
            $groupMax = ($index === 0 || $index === (strlen($IPv6) - 2)) ? $groupMax : $groupMax - 1;
            if ($groupCount > $groupMax) return false; // Too many IPv6 groups in address
        }
        // Check for unmatched characters
        array_multisort($matchesIP[1], SORT_DESC);
        if ($matchesIP[1][0] !== '') return false; // Illegal characters in address
        // It's a valid IPv6 address, so...
        return true;
    } else {
        // It's a domain name...
        // The syntax of a legal Internet host name was specified in RFC-952
        // One aspect of host name syntax is hereby changed: the
        // restriction on the first character is relaxed to allow either a
        // letter or a digit.
        // (http://tools.ietf.org/html/rfc1123#section-2.1)
        //
        // NB RFC 1123 updates RFC 1035, but this is not currently apparent from reading RFC 1035.
        //
        // Most common applications, including email and the Web, will generally not permit...escaped strings
        // (http://tools.ietf.org/html/rfc3696#section-2)
        //
        // Characters outside the set of alphabetic characters, digits, and hyphen MUST NOT appear in domain name
        // labels for SMTP clients or servers
        // (http://tools.ietf.org/html/rfc5321#section-4.1.2)
        //
        // RFC5321 precludes the use of a trailing dot in a domain name for SMTP purposes
        // (http://tools.ietf.org/html/rfc5321#section-4.1.2)
        $matches = array();
        $groupCount = preg_match_all('/(?:[0-9a-zA-Z][0-9a-zA-Z-]{0,61}[0-9a-zA-Z]|[a-zA-Z])(?:\\.|$)|(.)/', $domain, $matches);
        $level = count($matches[0]);
        if ($level == 1) return false; // Mail host can't be a TLD
        $TLD = $matches[0][$level - 1];
        if (substr($TLD, strlen($TLD) - 1, 1) === '.') return false; // TLD can't end in a dot
        if (preg_match('/^[0-9]+$/', $TLD) > 0) return false; // TLD can't be all-numeric
        // Check for unmatched characters
        array_multisort($matches[1], SORT_DESC);
        if ($matches[1][0] !== '') return false; // Illegal characters in domain, or label longer than 63 characters
        // Check DNS?
        if ($checkDNS && function_exists('checkdnsrr')) {
            if (!(checkdnsrr($domain, 'A') || checkdnsrr($domain, 'MX'))) {
                return false; // Domain doesn't actually exist
            }
        }
        // Eliminate all other factors, and the one which remains must be the truth.
        // (Sherlock Holmes, The Sign of Four)
        return true;
    }
}
function unitTest ($email, $reason = '') {
    $expected = ($reason === '') ? true : false;
    $valid = is_email($email);
    $not = ($valid) ? '' : ' not';
    $unexpected = ($valid !== $expected) ? ' <b>This was unexpected!</b>' : '';
    $reason = ($reason === '') ? "" : " Reason: $reason";
    return "The address <i>$email</i> is$not valid.$unexpected$reason<br />\n";
}
// Email validator test cases (Dominic Sayers, January 2009)
// Valid addresses
echo unitTest('first.last@example.com');
echo unitTest('1234567890123456789012345678901234567890123456789012345678901234@example.com');
echo unitTest('"first last"@example.com');
echo unitTest('"first\\"last"@example.com'); // Not totally sure whether this is valid or not
echo unitTest('first\\@last@example.com');
echo unitTest('"first@last"@example.com');
echo unitTest('first\\\\last@example.com'); // Note that \ is escaped even in single-quoted strings, so this tests first\\last@example.com
echo unitTest('first.last@x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x2345');
echo unitTest('first.last@[12.34.56.78]');
echo unitTest('first.last@[IPv6:::12.34.56.78]');
echo unitTest('first.last@[IPv6:1111:2222:3333::4444:12.34.56.78]');
echo unitTest('first.last@[IPv6:1111:2222:3333:4444:5555:6666:12.34.56.78]');
echo unitTest('first.last@[IPv6:::1111:2222:3333:4444:5555:6666]');
echo unitTest('first.last@[IPv6:1111:2222:3333::4444:5555:6666]');
echo unitTest('first.last@[IPv6:1111:2222:3333:4444:5555:6666::]');
echo unitTest('first.last@[IPv6:1111:2222:3333:4444:5555:6666:7777:8888]');
echo unitTest('first.last@x23456789012345678901234567890123456789012345678901234567890123.example.com');
echo unitTest('first.last@1xample.com');
echo unitTest('first.last@123.example.com');
// Invalid addresses
echo unitTest('first.last', "No @");
echo unitTest('@example.com', "No local part");
echo unitTest('12345678901234567890123456789012345678901234567890123456789012345@example.com', "Local part more than 64 characters");
echo unitTest('.first.last@example.com', "Local part starts with a dot");
echo unitTest('first.last.@example.com', "Local part ends with a dot");
echo unitTest('first..last@example.com', "Local part has consecutive dots");
echo unitTest('"first"last"@example.com', "Local part contains unescaped excluded characters");
echo unitTest('first\\\\@last@example.com', "Local part contains unescaped excluded characters");
echo unitTest('first.last@', "No domain");
echo unitTest('first.last@x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.
x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456', "Domain exceeds 255 chars");
echo unitTest('first.last@[.12.34.56.78]', "Only char that can precede IPv4 address is ':'");
echo unitTest('first.last@[12.34.56.789]', "Can't be interpreted as IPv4 so IPv6 tag is missing");
echo unitTest('first.last@[::12.34.56.78]', "IPv6 tag is missing");
echo unitTest('first.last@[IPv5:::12.34.56.78]', "IPv6 tag is wrong");
echo unitTest('first.last@[IPv6:1111:2222:3333::4444:5555:12.34.56.78]', "Too many IPv6 groups (4 max)");
echo unitTest('first.last@[IPv6:1111:2222:3333:4444:5555:12.34.56.78]', "Not enough IPv6 groups");
echo unitTest('first.last@[IPv6:1111:2222:3333:4444:5555:6666:7777:12.34.56.78]', "Too many IPv6 groups (6 max)");
echo unitTest('first.last@[IPv6:1111:2222:3333:4444:5555:6666:7777]', "Not enough IPv6 groups");
echo unitTest('first.last@[IPv6:1111:2222:3333:4444:5555:6666:7777:8888:9999]', "Too many IPv6 groups (8 max)");
echo unitTest('first.last@[IPv6:1111:2222::3333::4444:5555:6666]', "Too many '::' (can be none or one)");
echo unitTest('first.last@[IPv6:1111:2222:3333::4444:5555:6666:7777]', "Too many IPv6 groups (6 max)");
echo unitTest('first.last@[IPv6:1111:2222:333x::4444:5555]', "x is not valid in an IPv6 address");
echo unitTest('first.last@[IPv6:1111:2222:33333::4444:5555]', "33333 is not a valid group in an IPv6 address");
echo unitTest('first.last@example.123', "TLD can't be all digits");
echo unitTest('first.last@com', "Mail host must be second- or lower level");
echo unitTest('first.last@-xample.com', "Label can't begin with a hyphen");
echo unitTest('first.last@exampl-.com', "Label can't end with a hyphen");
echo unitTest('first.last@x234567890123456789012345678901234567890123456789012345678901234.example.com', "Label can't be longer than 63 octets");
// Test cases from RFC3696 (February 2004, http://tools.ietf.org/html/rfc3696#section-3)
echo unitTest('Abc\\@def@example.com');
echo unitTest('Fred\\ Bloggs@example.com');
echo unitTest('Joe.\\\\Blow@example.com');
echo unitTest('"Abc@def"@example.com');
echo unitTest('"Fred Bloggs"@example.com');
echo unitTest('user+mailbox@example.com');
echo unitTest('customer/department=shipping@example.com');
echo unitTest('$A12345@example.com');
echo unitTest('!def!xyz%abc@example.com');
echo unitTest('_somename@example.com');
// Test cases from Doug Lovell (LinuxJournal, June 2007, http://www.linuxjournal.com/article/9585)
echo unitTest("dclo@us.ibm.com");
echo unitTest("abc\\@def@example.com");
echo unitTest("abc\\\\@example.com");
echo unitTest("Fred\\ Bloggs@example.com");
echo unitTest("Joe.\\\\Blow@example.com");
echo unitTest("\"Abc@def\"@example.com");
echo unitTest("\"Fred Bloggs\"@example.com");
echo unitTest("customer/department=shipping@example.com");
echo unitTest("\$A12345@example.com");
echo unitTest("!def!xyz%abc@example.com");
echo unitTest("_somename@example.com");
echo unitTest("user+mailbox@example.com");
echo unitTest("peter.piper@example.com");
echo unitTest("Doug\\ \\\"Ace\\\"\\ Lovell@example.com");
echo unitTest("\"Doug \\\"Ace\\\" L.\"@example.com");
echo unitTest("abc@def@example.com", "Doug Lovell says this should fail");
echo unitTest("abc\\\\@def@example.com", "Doug Lovell says this should fail");
echo unitTest("abc\\@example.com", "Doug Lovell says this should fail");
echo unitTest("@example.com", "Doug Lovell says this should fail");
echo unitTest("doug@", "Doug Lovell says this should fail");
echo unitTest("\"qu@example.com", "Doug Lovell says this should fail");
echo unitTest("ote\"@example.com", "Doug Lovell says this should fail");
echo unitTest(".dot@example.com", "Doug Lovell says this should fail");
echo unitTest("dot.@example.com", "Doug Lovell says this should fail");
echo unitTest("two..dot@example.com", "Doug Lovell says this should fail");
echo unitTest("\"Doug \"Ace\" L.\"@example.com", "Doug Lovell says this should fail");
echo unitTest("Doug\\ \\\"Ace\\\"\\ L\\.@example.com", "Doug Lovell says this should fail");
echo unitTest("hello world@example.com", "Doug Lovell says this should fail");
echo unitTest("gatsby@f.sc.ot.t.f.i.tzg.era.l.d.", "Doug Lovell says this should fail");
?>
Source reference: http://www.jbxue.com/article/11340.html
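Both the IPv6 branch and the domain-name branch of `is_email()` use the same trick to detect illegal characters: the pattern lists the valid tokens first and ends with a catch-all `(.)` alternative, so any character that fits no valid token ends up in capture group 1. The following is a minimal standalone sketch of that idea, reusing the domain-label pattern from the listing. The helper name `has_only_valid_labels` is hypothetical and is not part of the original code.

```php
<?php
// Hypothetical helper for illustration; it is not part of the original listing.
// It isolates the "catch-all alternative" check used by is_email():
// every character that is not consumed by a valid label falls through
// to the final (.) group, so any non-empty capture means a stray character.
function has_only_valid_labels(string $domain): bool {
    $matches = array();
    // Same label pattern as in the listing: either a single letter, or
    // 2 to 63 characters that start and end with a letter or digit and
    // contain only letters, digits and hyphens. Each label must be
    // followed by a dot or by the end of the string.
    preg_match_all(
        '/(?:[0-9a-zA-Z][0-9a-zA-Z-]{0,61}[0-9a-zA-Z]|[a-zA-Z])(?:\\.|$)|(.)/',
        $domain,
        $matches
    );
    // Group 1 is an empty string for every match that was a valid label.
    foreach ($matches[1] as $stray) {
        if ($stray !== '') return false; // Something did not fit any label
    }
    return true;
}

var_dump(has_only_valid_labels('example.com'));  // bool(true)
var_dump(has_only_valid_labels('-xample.com'));  // bool(false)
var_dump(has_only_valid_labels('exampl-.com'));  // bool(false)
```

The listing sorts the captures in descending order with `array_multisort()` and inspects the first element. The loop in this sketch performs the same check without reordering the array.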