Can you fix this regex?

Developer https://www.devze.com 2023-03-24 18:33 Source: Internet
I'm terrible with RegEx and found this bit somewhere on the interwebs. It's for matching Twitter-style @username but it has one small problem - it also accepts a space as a word.

NSRegularExpression *atRegex = [NSRegularExpression regularExpressionWithPattern:@"(?<!\\w)@([\\w\\._-]+)?" options:NSRegularExpressionCaseInsensitive error:&error];

Example: "@erik" is matched correctly, but "@ erik" is also matched and should not be.


Your regular expression contains

@(...)?

The ? at the end means that everything inside the preceding (...) is completely optional. So, your regex doesn't have to match anything following a @.
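To see this concretely, here is a minimal sketch, assuming a small Foundation command-line program, that runs the pattern from the question against the "@ erik" example. The whole match is just the "@", and capture group 1 does not participate at all:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        NSError *error = nil;
        // The pattern from the question, with the trailing "?" still in place.
        NSRegularExpression *atRegex =
            [NSRegularExpression regularExpressionWithPattern:@"(?<!\\w)@([\\w\\._-]+)?"
                                                      options:NSRegularExpressionCaseInsensitive
                                                        error:&error];
        NSString *input = @"@ erik";
        NSTextCheckingResult *match =
            [atRegex firstMatchInString:input
                                options:0
                                  range:NSMakeRange(0, input.length)];

        // The overall match covers only the "@" at index 0.
        NSLog(@"matched: '%@'", [input substringWithRange:match.range]);

        // A group that did not participate reports a location of NSNotFound.
        BOOL hasName = [match rangeAtIndex:1].location != NSNotFound;
        NSLog(@"username captured: %@", hasName ? @"yes" : @"no");
    }
    return 0;
}
```

So the space is never accepted as a username; the regex simply succeeds on the bare "@" and stops there.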

To fix this, you may be able to remove the ( )?, leaving:

"(?<!\\w)@[\\w\\._-]+"

However, you should also investigate what that (?<!\\w) is doing for you and whether you need it.
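As a rough check of both points, the sketch below tries the shortened pattern on three inputs. The first two come from the question; the email address is a made-up sample added for illustration. It suggests that the lookbehind is what keeps the "@" inside an email address from being treated as a mention:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        NSError *error = nil;
        // Same pattern as above: the name is now required (no optional group).
        NSRegularExpression *atRegex =
            [NSRegularExpression regularExpressionWithPattern:@"(?<!\\w)@[\\w\\._-]+"
                                                      options:0
                                                        error:&error];

        // "me@example.com" is a hypothetical sample, not from the question.
        NSArray<NSString *> *samples = @[ @"hi @erik", @"@ erik", @"mail me@example.com" ];

        for (NSString *sample in samples) {
            NSTextCheckingResult *match =
                [atRegex firstMatchInString:sample
                                    options:0
                                      range:NSMakeRange(0, sample.length)];
            NSString *found = match ? [sample substringWithRange:match.range] : @"no match";
            NSLog(@"'%@' -> %@", sample, found);
        }
        // "hi @erik"            -> @erik
        // "@ erik"              -> no match (nothing valid follows the "@")
        // "mail me@example.com" -> no match (the "@" is preceded by the word character "e")
    }
    return 0;
}
```

If you drop the (?<!\\w) part, the third sample would match "@example.com", so keep it unless you actually want that.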


The reason @ erik is matched is most likely because your capturing group is:

([\\w\\._-]+)

That means one or more word characters, periods, underscores, or dashes. So @ erik is matched since "erik" meets these criteria. The lookbehind assertion and the @ symbol are not being included in the match group, but they should be, since they are the criteria for a match.

Try combining the zero-width negative lookbehind assertion you have

(?<!\\w)

which asserts that the preceding character is not a word character, into your capture group. It will not be included in the match, but will combine to mean "find a string of one or more word characters, periods, underscores, or dashes, following an "@" symbol that is not preceded by a word character". As Tim pointed out, this is to avoid email matches.

Try this:

"((?<!\\w)@[\\w\\._-]+)" 

*Please note that I am not an Objective-C programmer, so I am not familiar enough with it to know if you need to write \\w instead of \w. In the flavors of regex I am used to, you would only use one escape character. Please consult your documentation if the above does not work.
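On the escaping question: in an Objective-C string literal the backslash is itself an escape character, so \\w in source code becomes the two characters \w by the time the regex engine sees it. A minimal sketch, assuming a Foundation command-line program, that prints the pattern as the engine receives it:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // In source code every backslash meant for the regex is doubled...
        NSString *pattern = @"((?<!\\w)@[\\w\\._-]+)";

        // ...but the resulting string contains single backslashes.
        NSLog(@"%@", pattern); // prints ((?<!\w)@[\w\._-]+)
    }
    return 0;
}
```

So the doubled form is correct when the pattern is written as a literal in code, and the single form is what you would type into a regex tester.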


@\S*

http://regexpal.com/ can really help in a bind
