I'm wondering if it's possible to compare values in regexps with the regexp system in Python. Matching the pattern of an IP is easy, but each group of 1-3 digits cannot be above 255 and that's where I'm a bit stumped.
No need for regular expressions here. Some background:
>>> import socket
>>> socket.inet_aton('255.255.255.255')
'\xff\xff\xff\xff'
>>> socket.inet_aton('255.255.255.256')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
error: illegal IP address string passed to inet_aton
>>> socket.inet_aton('my name is nobody')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
error: illegal IP address string passed to inet_aton
So:
import socket

def ip_address_is_valid(address):
    try: socket.inet_aton(address)
    except socket.error: return False
    else: return True
Note that addresses like '127.1' could be acceptable on your machine (there are systems, including MS Windows and Linux, where missing octets are interpreted as zero, so '127.1' is equivalent to '127.0.0.1', and '10.1.4' is equivalent to '10.1.0.4'). Should you require that there are always 4 octets, change the last line from:
else: return True
into:
else: return address.count('.') == 3
You need to check the allowed digits in each position. In a three-digit octet the first digit must be 0-2. If the first digit is 2, the second must be 0-5, and if the octet starts with 25, the third must be 0-5 as well; otherwise the second and third digits can be 0-9.
I found this annotated example at http://www.regular-expressions.info/regexbuddy/ipaccurate.html :
\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
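A minimal sketch of using that pattern from Python, assuming you want to validate a whole string rather than search inside a larger text. The names `OCTET`, `IPV4_RE` and `is_ipv4` are made up for illustration. The `\b` anchors are dropped because `re.fullmatch` (Python 3.4+) already requires the entire string to match.

```python
import re

# One octet: 250-255, 200-249, or 0-199 (a leading zero is allowed by this pattern).
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Three octets each followed by a dot, then a final octet.
IPV4_RE = re.compile(r"(?:%s\.){3}%s" % (OCTET, OCTET))

def is_ipv4(text):
    """Return True if the whole string is a dotted quad with octets in 0-255."""
    return IPV4_RE.fullmatch(text) is not None
```

Note that this pattern accepts zero-padded octets such as '01.2.3.4'; whether that is acceptable depends on your use case.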
You can check a 4-octet IP address easily without regexes at all. Here's a tested working method:
>>> def valid_ip(ip):
...     parts = ip.split('.')
...     return (
...         len(parts) == 4
...         and all(part.isdigit() for part in parts)
...         and all(0 <= int(part) <= 255 for part in parts)
...     )
...
>>> valid_ip('1.2.3.4')
True
>>> valid_ip('1.2.3.4.5')
False
>>> valid_ip('1.2. 3 .4.5')
False
>>> valid_ip('1.256.3.4.5')
False
>>> valid_ip('1.B.3.4')
False
>>>
Regex is for pattern matching, but to check for a valid IP, you need to check the range of each number (i.e. 0 <= n <= 255).
You may use regex to check the range, but that would be overkill. I think you're better off checking for the basic pattern and then checking the range of each number.
For example, use the following pattern to match an IP:
([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})
Then check whether each number is within range.
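The two steps could look something like this. It is only a sketch: the names `IP_SHAPE` and `match_then_check_range` are hypothetical, and `re.fullmatch` assumes Python 3.4 or later.

```python
import re

# Step 1: the basic shape, four groups of 1-3 digits separated by dots.
IP_SHAPE = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})")

def match_then_check_range(text):
    """Match the shape first, then check that every captured number is at most 255."""
    m = IP_SHAPE.fullmatch(text)
    if m is None:
        return False
    # Step 2: the groups contain digits only, so int() cannot fail here.
    return all(int(group) <= 255 for group in m.groups())
```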
The following supports IPv4, IPv6 as well as Python 2.7 & 3.3:

import socket

def is_valid_ipv4(ip_str):
    """
    Check the validity of an IPv4 address
    """
    try:
        socket.inet_pton(socket.AF_INET, ip_str)
    except AttributeError:
        # inet_pton is not available on this platform; fall back to inet_aton
        try:
            socket.inet_aton(ip_str)
        except socket.error:
            return False
        # inet_aton accepts short forms like '127.1', so require 4 octets
        return ip_str.count('.') == 3
    except socket.error:
        return False
    return True

def is_valid_ipv6(ip_str):
    """
    Check the validity of an IPv6 address
    """
    try:
        socket.inet_pton(socket.AF_INET6, ip_str)
    except socket.error:
        return False
    return True

def is_valid_ip(ip_str):
    """
    Check the validity of an IP address
    """
    return is_valid_ipv4(ip_str) or is_valid_ipv6(ip_str)
IP addresses can also be checked with split as follows:
all(map((lambda x: 0<=x<=255),map(int,ip.split('.')))) and len(ip.split("."))==4
For me that's a little bit more readable than regex.
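One caveat with the one-liner: int() raises ValueError for a part like 'B' or an empty string, so it throws rather than returning False on malformed input. A slightly longer sketch of the same split idea that returns False instead (the function name `valid_ip_by_split` is made up here):

```python
def valid_ip_by_split(ip):
    """Split on dots and check each part, returning False instead of raising."""
    parts = ip.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        # Accept plain ASCII digits only: rejects '', ' 3 ', '+1' and 'B'.
        if not part or any(ch not in "0123456789" for ch in part):
            return False
        if int(part) > 255:
            return False
    return True
```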
I think people are taking this too far. I suggest you first do this: ips = re.findall(r'(?:\d{1,3})\.(?:\d{1,3})\.(?:\d{1,3})\.(?:\d{1,3})', page), then split each match on '.' and check that all four numbers are smaller than 256. (The dots are escaped so that they match a literal '.' rather than any character.)
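A rough sketch of that find-then-filter approach, assuming `page` is a string of text to scan. The names `CANDIDATE_RE` and `find_ips` are hypothetical, and the `\b` word boundaries are an addition so that a longer run of digits is not split into a partial match.

```python
import re

# Loose candidate pattern: four dot-separated groups of 1-3 digits.
CANDIDATE_RE = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")

def find_ips(page):
    """Return the candidates found in page whose four numbers are all below 256."""
    candidates = CANDIDATE_RE.findall(page)
    return [c for c in candidates if all(int(n) < 256 for n in c.split("."))]
```

For example, find_ips("hosts 10.0.0.1 and 300.1.1.1") keeps '10.0.0.1' and drops '300.1.1.1'.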
You need this:
^((([1-9])|(0[1-9])|(0[0-9][1-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]))\.){3}(([1-9])|(0[1-9])|(0[0-9][1-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]))$