I 开发者_如何学Pythonneed to be able to tell the difference between a string that can contain letters and numbers, and a string that can contain numbers, colons and hyphens.
>>> def checkString(s):
... pattern = r'[-:0-9]'
... if re.search(pattern,s):
... print "Matches pattern."
... else:
... print "Does not match pattern."
# 3 Numbers seperated by colons. 12, 24 and minus 14
>>> s1 = "12:24:-14"
# String containing letters and string containing letters/numbers.
>>> s2 = "hello"
>>> s3 = "hello2"
When I run the checkString
method on each of the above strings:
>>>checkString(s1)
Matches Pattern.
>>>checkString(s2)
Does not match Pattern.
>>>checkString(s3)
Matches Pattern
s3 is the only one that doesn't do what I want. I'd like to be able to create a regex that allows numbers, colons and hyphens, but excludes EVERYTHING else (or just alphabetical characters). Can anyone point me in the right direction?
EDIT:
Therefore, I need a regex that would accept:
229 // number
187:657 //two numbers
187:678:-765 // two pos and 1 neg numbers
and decline:
Car //characters
Car2 //characters and numbers
you need to match the whole string, not a single character as you do at the moment:
>>> re.search('^[-:0-9]+$', "12:24:-14")
<_sre.SRE_Match object at 0x01013758>
>>> re.search('^[-:0-9]+$', "hello")
>>> re.search('^[-:0-9]+$', "hello2")
To explain regex:
- within square brackets (character class): match digits 0 to 9, hyphen and colon, only once.
+
is a quantifier, that indicates that preceding expression should be matched as many times as possible but at least once.^
and$
match start and end of the string. For one-line strings they're equivalent to\A
and\Z
.
This way you restrict content of the whole string to be at least one-charter long and contain any permutation of characters from the character class. What you were doing before hand was to search for a single character from the character class within subject string. This is why s3
that contains a digit matched.
SilentGhost's answer is pretty good, but take note that it would also match strings like "---::::"
with no digits at all.
I think you're looking for something like this:
'^(-?\d+:)*-?\d+$'
^
Matches the beginning of the line.(-?\d+:)*
Possible - sign, at least one digit, a colon. That whole pattern 0 or many times.-?\d+
Then the pattern again, at least once, without the colon$
The end of the line
This will better match the strings you describe.
pattern = r'\A([^-:0-9]+|[A-Za-z0-9])\Z'
Your regular expression is almost fine; you just need to make it match the whole string. Also, as a commenter pointed out, you don't really need a raw string (the r
prefix on the string) in this case. Voila:
def checkString(s):
if re.match('[-:0-9]+$', s):
print "Matches pattern."
else:
print "Does not match pattern."
The '+' means "match one or more of the previous expression". (This will make checkString return False on an empty string. If you want True on an empty string, change the '+' to a '*'.) The '$' means "match the end of the string".
re.match means "the string must match the regular expression starting at the first character"; re.search means "the regular expression can match a sequence anywhere inside the string".
Also, if you like premature optimization--and who doesn't!--note that 're.match' needs to compile the regular expression each time. This version compiles the regular expression only once:
__checkString_re = re.compile('[-:0-9]+$')
def checkString(s):
global __checkString_re
if __checkString_re.match(s):
print "Matches pattern."
else:
print "Does not match pattern."
精彩评论