I need to come up with a regular expression for a mini-project.
The string should not start with:
"/wiki"
and it should also not have the following pattern
"/.*:.*"
(basically pattern starts with char '/' and there is any occurrence of ':' after that)
and it also cannot have a certain character '#'
So basically all these strings would fail:
"/wiki/index.php?title=ROM/TAP&action=edit&section=2"
"/User:romamns"
"/Special:Watchlist"
"/Space_Wiki:Privacy_policy"
"#column-one"
And all these strings would pass:
"/ROM/TAP/mouse"
"http://www.boost.org/"
I will be using the regex in Python (if that makes any difference).
Thanks for any help.
^(/(?!wiki)[^:#]*|[^#/][^#]*)$
should be OK (as tested here). Of course I might be missing something, but this appears to follow your specification.
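One way to check the pattern above against the sample strings from the question is a small Python script. This is only a sketch: the names PATTERN, is_allowed, should_fail and should_pass are made up for illustration and are not from the original post.

```python
import re

# The pattern from the answer above, compiled once.
PATTERN = re.compile(r"^(/(?!wiki)[^:#]*|[^#/][^#]*)$")

# Sample strings taken from the question.
should_fail = [
    "/wiki/index.php?title=ROM/TAP&action=edit&section=2",
    "/User:romamns",
    "/Special:Watchlist",
    "/Space_Wiki:Privacy_policy",
    "#column-one",
]
should_pass = [
    "/ROM/TAP/mouse",
    "http://www.boost.org/",
]


def is_allowed(s):
    """Return True if s matches the pattern (hypothetical helper)."""
    return PATTERN.match(s) is not None


for s in should_fail + should_pass:
    print(is_allowed(s), s)
```

Note that with this pattern a string starting with '/' may not contain ':' anywhere, while a string not starting with '/' may contain ':' freely, which is why the http URL is accepted.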
This tested script implements a commented regex which precisely matches your stated requirements:
import re

def check_str(subject):
    """Return True if subject matches."""
    reobj = re.compile(
        r"""             # Match special string
        (?!/wiki)        # Does not start with /wiki.
        (?![^/]*/[^:]*:) # Does not have : following /
        [^#]*            # Match whole string having no #
        $                # Anchor to end of string.
        """,
        re.IGNORECASE | re.MULTILINE | re.VERBOSE)
    # match() anchors at the start of the string, so no leading ^ is needed.
    return bool(reobj.match(subject))

data_list = [
    r"/wiki/index.php?title=ROM/TAP&action=edit&section=2",
    r"/User:romamns",
    r"/Special:Watchlist",
    r"/Space_Wiki:Privacy_policy",
    r"#column-one",
    r"/ROM/TAP/mouse",
    r"http://www.boost.org/",
]

# Print the match result (True/False) for each sample, numbered from 1.
for cnt, data in enumerate(data_list, 1):
    print("Data[%d] = \"%s\"" % (cnt, check_str(data)))
If a string matches the following regular expression, then it should fail:
^(\/wiki|.*?[\:#])
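Note that, as written, this rejection pattern also matches "http://www.boost.org/", because `.*?[\:#]` accepts a ':' anywhere in the string, so that sample would be rejected even though the question says it should pass. A possible variant is sketched below in Python, assuming the colon rule only applies to a ':' that comes after a '/'. The names REJECT and is_allowed are hypothetical, chosen for illustration.

```python
import re

# Reject if the string starts with /wiki, has a ':' somewhere after a '/',
# or contains '#' anywhere. This is an assumed reading of the question.
REJECT = re.compile(r"^(/wiki|[^/]*/.*:|.*#)")


def is_allowed(s):
    """Return True if s does NOT match the rejection pattern."""
    return REJECT.match(s) is None


samples = [
    "/wiki/index.php?title=ROM/TAP&action=edit&section=2",
    "/User:romamns",
    "/Special:Watchlist",
    "/Space_Wiki:Privacy_policy",
    "#column-one",
    "/ROM/TAP/mouse",
    "http://www.boost.org/",
]

for s in samples:
    print(is_allowed(s), s)
```

Inverting the test this way keeps the regex free of lookaheads, at the cost of having to remember that a match means "reject".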