I have this pattern:
/([^>'"])(http|ftp)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/
when using this as a subject:
http://www.google.com <a href="http://www.google.com">http://www.google.com</a> http://www.google.com
It matches the last http://www.google.com but not the first one at the start of the line. How can I get it to match the first one at the start of the line too (while still not matching inside the anchor tag)?
It's because [^'">] means any one character that isn't ', " or >. There is no character before the http at the start of the line, which is why it's not matching.
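A quick way to see this is to run the question's pattern against the subject. This is just a minimal Python sketch, not the asker's code: the names SUBJECT, ORIGINAL and scheme_starts are made up for illustration, and the pattern is transcribed to Python's re syntax (no / delimiters, so the slashes need no escaping).

```python
import re

# Subject string from the question.
SUBJECT = (
    'http://www.google.com '
    '<a href="http://www.google.com">http://www.google.com</a> '
    'http://www.google.com'
)

# The question's pattern, transcribed for Python's re module.
ORIGINAL = re.compile(r'''([^>'"])(http|ftp)+(s)?:(//)((\w|\.)+)(/)?(\S+)?''')


def scheme_starts(pattern, text):
    """Return the offset where group 2 (the http/ftp scheme) starts in each match."""
    return [m.start(2) for m in pattern.finditer(text)]


# Only the last URL is found: it is the only one preceded by a character
# that is not >, ' or ". The URL at offset 0 has nothing in front of it,
# so the leading ([^>'"]) group has nothing to consume.
print(scheme_starts(ORIGINAL, SUBJECT))
```

The single offset printed belongs to the URL after the closing anchor tag, which is preceded by a space.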
One possibility (not necessarily the best) is to use something like:
(([^'">])(http))|(^http)
(either of two possible patterns). This basically means: give me everything you currently match, as well as "http" at the start of the line.
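As a rough check of that two-alternative pattern, here is a small Python sketch. The names SUBJECT, EITHER and starts are only illustrative, and the pattern is the shortened one above (it stops at "http" instead of matching the whole URL).

```python
import re

# Subject string from the question.
SUBJECT = (
    'http://www.google.com '
    '<a href="http://www.google.com">http://www.google.com</a> '
    'http://www.google.com'
)

# Either "a permitted character followed by http" or "http at the very start".
EITHER = re.compile(r'''(([^'">])(http))|(^http)''')

# Offset at which "http" itself begins in each match:
# group 3 when the first alternative matched, group 4 for the ^http alternative.
starts = [
    m.start(3) if m.group(3) is not None else m.start(4)
    for m in EITHER.finditer(SUBJECT)
]
print(starts)
```

With this subject, the two occurrences inside the anchor tag are still skipped, because they are preceded by " and > respectively.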
I don't doubt there are trickier ways to do this with the more advanced regex features like look-ahead, negative look-behind or the little known surreptitious look-under (a), but I prefer simplicity most of the time.
(a) Some features alluded to in this answer may not, in fact, exist :-)
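For comparison, the negative look-behind mentioned above could look something like this in Python. It is only a sketch, assuming an engine that supports look-behind (Python's re does, some engines do not); LOOKBEHIND, SUBJECT and matches are illustrative names, and the rest of the pattern is copied from the question.

```python
import re

# Subject string from the question.
SUBJECT = (
    'http://www.google.com '
    '<a href="http://www.google.com">http://www.google.com</a> '
    'http://www.google.com'
)

# (?<![>'"]) asserts that the previous character is NOT >, ' or ".
# It also succeeds when there is no previous character at all,
# which is exactly the start-of-line case.
LOOKBEHIND = re.compile(r'''(?<![>'"])(http|ftp)+(s)?:(//)((\w|\.)+)(/)?(\S+)?''')

# The look-behind consumes nothing, so group(0) is the URL itself,
# without the extra leading character the original pattern captures.
matches = [(m.start(), m.group(0)) for m in LOOKBEHIND.finditer(SUBJECT)]
print(matches)
```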
/(^|[^>'"])(http|ftp)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/
will do it for you. A ^ inside [] negates the rest of the characters in the class. To match the start of the line, the ^ has to sit outside the [], which is why it appears here as the first alternative in the leading (^|[^>'"]) group.
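Here is a small Python sketch that tries this pattern on the subject from the question. FIXED, SUBJECT and starts are made-up names, and the pattern is transcribed to Python's re syntax without the / delimiters.

```python
import re

# Subject string from the question.
SUBJECT = (
    'http://www.google.com '
    '<a href="http://www.google.com">http://www.google.com</a> '
    'http://www.google.com'
)

# Leading group is now "start of string OR one character that is not >, ' or ".
# Note: without re.MULTILINE, ^ only matches at the start of the whole string;
# pass re.MULTILINE if the input has several lines.
FIXED = re.compile(r'''(^|[^>'"])(http|ftp)+(s)?:(//)((\w|\.)+)(/)?(\S+)?''')

# Offset where group 2 (the scheme) starts in each match.
starts = [m.start(2) for m in FIXED.finditer(SUBJECT)]
print(starts)
```

Both the first and the last URL are found, while the two inside the anchor tag are still skipped.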
Try ([^'">])?(http) (untested).