I have this regular expression to get URLs:
(((ht|f)tp(s?))://)?(www.|[a-zA-Z].)[a-zA-Z0-9-.]+.(com|edu|gov|mil|net|org|biz|info|name|museum|us|ca|uk)(:[0-9]+)(/($|[a-zA-Z0-9.\,\;\?\'\+&%\$#\=~_-]+))
I want to modify it so that the array of matched strings also includes everything that comes before the URL. How can I do this?
Prepend ^(.*?) to the regular expression. That adds a non-greedy capture group matching all characters between the start of the input string and the text matched by the rest of your expression. Because it is the first group, it shifts the numbers of your existing groups up by one.
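As a rough illustration of the idea, here is a minimal Python sketch. The question does not say which language or regex engine is in use, so Python is an assumption. `URL_PATTERN` is a deliberately simplified stand-in for the expression in the question, and `split_prefix_and_url` is a hypothetical helper name, not part of any library.

```python
import re

# Simplified stand-in for the URL expression in the question (an assumption,
# kept short so the effect of the prefix group is easy to see).
URL_PATTERN = r"(?:https?|ftp)://[a-zA-Z0-9.-]+\.(?:com|org|net)(?:/[a-zA-Z0-9._~-]*)?"

# Prepend ^(.*?) so that group 1 lazily captures everything from the start
# of the input up to the first place where the URL pattern matches.
# re.DOTALL lets the prefix span line breaks.
PREFIXED = re.compile(r"^(.*?)(" + URL_PATTERN + r")", re.DOTALL)


def split_prefix_and_url(text):
    """Return (text_before_url, url) for the first URL, or None if no URL is found."""
    match = PREFIXED.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(split_prefix_and_url("See the docs at http://example.com/page for details"))
```

Note that because of the ^ anchor, this only finds the first URL in the input. If you need the text before every URL, you would have to drop the anchor and handle the pieces between matches yourself.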