How can I have two wildcards in this regex expression?

Developer https://www.devze.com 2023-03-03 14:20 Source: Internet
trying to get the following regex: <- bad english from me :(

I'm trying to get the following input text converted to a regex...

xx.*.aaa.bbb*

where * are wildcards .. as in .. they represent wildcards to me .. not regex syntax.

Any suggestions, please?

Update - example inputs.

  • xx.zzzzzzzzz.aaa.bbb = match
  • xx.eee.aaa.bbbzzzz = match
  • xx.eee.aaa.bbb.zzzz = match
  • xx.aaa.bbb = not a match


You misunderstood the concept of * in Regular Expressions.

I think what you are looking for is:

xx\..*\.aaa\.bbb.*

The thing is:

  • a . is not a literal dot. It means any character, so if you want to match a literal . you must escape it: \.
  • * means that the character that precedes it will be matched zero or more times. So how do you emulate the wildcard you are looking for? Use .*, which matches any character zero or more times.

If you want to match the entire string, and not just any substring that fits the pattern, you have to include ^ at the beginning and $ at the end, so your regex will be:

^xx\..*\.aaa\.bbb.*$
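To check the anchored expression against the example inputs from the question, a minimal Python sketch could look like this. The names PATTERN, SAMPLES and is_match are made up for illustration, and Python is only an assumption here, since the question does not say which regex engine is in use.

```python
import re

# The anchored pattern from the answer above: escaped dots, .* as the wildcard.
PATTERN = re.compile(r"^xx\..*\.aaa\.bbb.*$")

# Example inputs taken from the question's update.
SAMPLES = [
    "xx.zzzzzzzzz.aaa.bbb",
    "xx.eee.aaa.bbbzzzz",
    "xx.eee.aaa.bbb.zzzz",
    "xx.aaa.bbb",
]


def is_match(text: str) -> bool:
    """Return True if the string fits the pattern."""
    return PATTERN.match(text) is not None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, "=", "match" if is_match(sample) else "not a match")
```

The last sample fails because, after xx., the pattern still needs a literal .aaa somewhere, and the string has no dot before aaa left to use.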


Try this expression:

^xx\.[^\.]+\.aaa\.bbb.*
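This expression is stricter than the .* version suggested earlier in the thread: [^\.]+ demands exactly one non-empty segment without dots between xx. and .aaa. A rough Python comparison, where LOOSE, STRICT and compare are illustrative names only:

```python
import re

# The .* version from the earlier answer and the [^\.]+ version from this one.
LOOSE = re.compile(r"^xx\..*\.aaa\.bbb.*$")
STRICT = re.compile(r"^xx\.[^\.]+\.aaa\.bbb.*")


def compare(text: str) -> tuple:
    """Return (loose_result, strict_result) for one input string."""
    return (LOOSE.match(text) is not None, STRICT.match(text) is not None)


if __name__ == "__main__":
    for text in ["xx.eee.aaa.bbb", "xx..aaa.bbb", "xx.a.b.aaa.bbb", "xx.aaa.bbb"]:
        print(text, compare(text))
```

Which behavior is wanted depends on whether the wildcard segment may be empty or contain dots; the question's examples do not settle that.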


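Assuming that you're saying that * is a wildcard in the 'normal sense', and that your string isn't an attempt at regex, I'd say that xx\..+\.aaa\.bbb.+ is what you're after. Note that the trailing .+ requires at least one character after bbb, so it rejects the example xx.zzzzzzzzz.aaa.bbb; use .* at the end if nothing may follow.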


What you refer to as "wildcard -- not regex syntax" is from globbing. It's a pattern-matching technique that was popularized by the earliest versions of Unix around 1970. Originally it was a separate program, called glob, whose result was passed on to other programs. Now bash, MS-DOS and almost any shell have this feature built in. In globbing, * normally means match any character, any number of times.

The regex syntax is different. The .* idiom in regex is similar to the * in globbing, but not exactly the same. Normally, .* doesn't match line breaks. You usually have to set single-line mode (called multiline mode in Ruby) if you want .* to match any character, any number of times, in regex.
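As a sketch of the relationship between the two notations, the snippet below converts a wildcard pattern where only * is special into a regex, and shows the line-break difference. wildcard_to_regex is a hypothetical helper written for this example; fnmatch and re are from the Python standard library, and re.DOTALL is Python's name for single-line mode.

```python
import fnmatch
import re


def wildcard_to_regex(pattern: str) -> str:
    """Turn a pattern where only * is special into a regex string."""
    # re.escape makes regex metacharacters literal (so . becomes \.),
    # then each escaped \* is swapped for the regex idiom .*
    return re.escape(pattern).replace(r"\*", ".*")


regex = wildcard_to_regex("xx.*.aaa.bbb*")
print(regex)  # xx\..*\.aaa\.bbb.*

# The standard library can also match the glob directly, without a regex.
print(fnmatch.fnmatchcase("xx.eee.aaa.bbb.zzzz", "xx.*.aaa.bbb*"))

# By default . stops at line breaks; re.DOTALL lets it match them too.
print(re.fullmatch(r"a.*b", "a\nb") is None)
print(re.fullmatch(r"a.*b", "a\nb", re.DOTALL) is not None)
```

The helper deliberately ignores other glob features such as ? and [...]; it only covers the * case from the question.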


* are not wildcards; they mean the preceding character is repeated zero or more times.

And the dot can be any character.

UPDATE:

You can try this

^xx\.[a-z]+\.aaa\.bbb\.?[a-z]*

You can test it online, for example on Rubular.

The [a-z] parts are character classes, within which you can define which characters are allowed (or not allowed, using [^a-z]). So if you are only looking for lowercase letters, you can use [a-z].

The + means it has to be there at least once.

The \.? near the end means there can be a dot or not

The ^ at the beginning means to match at the start of the string
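The same expression can be written out piece by piece. The sketch below uses Python's re.VERBOSE flag so that each part carries its own comment; ANNOTATED and matches are names chosen for this example, and Python is an assumption rather than something the thread specifies.

```python
import re

# The expression from this answer, spread out with one comment per piece.
ANNOTATED = re.compile(
    r"""
    ^xx        # literal xx at the start of the string
    \.         # a literal dot
    [a-z]+     # one or more lowercase letters
    \.aaa      # literal .aaa
    \.bbb      # literal .bbb
    \.?        # an optional dot
    [a-z]*     # zero or more lowercase letters
    """,
    re.VERBOSE,
)


def matches(text: str) -> bool:
    """Return True if the text matches from its start."""
    return ANNOTATED.match(text) is not None


if __name__ == "__main__":
    for text in ["xx.eee.aaa.bbb.zzzz", "xx.aaa.bbb", "xx.EEE.aaa.bbb"]:
        print(text, matches(text))
```

Because there is no $ at the end, anything may follow the matched part; add $ if the string must end there.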

A nice tutorial (for Perl, but at least the basics are the same nearly everywhere) is the PerlReTut
