I have the following string to match:
aaaa_bb_cc
and I have written a regex pattern like
\\w{4}+\\_\\w{2}\\_\\w{2}
and it works. Is there a simpler regex that does the same thing?
You don't need to escape the underscores:
\w{4}+_\w{2}_\w{2}
And you can collapse the last two parts, if you don't capture them anyway:
\w{4}+(?:_\w{2}){2}
Doesn't get shorter, though.
(Note: Re-add the needed backslashes for Java's strings, if you like; I prefer to omit them while talking about regular expressions :))
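As a quick check, here is a minimal Java sketch that compiles the three variants above with the backslashes doubled for string literals and runs them against a few inputs. The class name, constant names, and sample inputs are made up for illustration.

```java
import java.util.regex.Pattern;

public class UnderscorePatternDemo {
    // The pattern from the question, underscores escaped.
    static final Pattern ORIGINAL = Pattern.compile("\\w{4}+\\_\\w{2}\\_\\w{2}");
    // Same pattern without the unnecessary escapes.
    static final Pattern UNESCAPED = Pattern.compile("\\w{4}+_\\w{2}_\\w{2}");
    // Last two parts collapsed into a repeated non-capturing group.
    static final Pattern GROUPED = Pattern.compile("\\w{4}+(?:_\\w{2}){2}");

    // True only if all three patterns give the same answer for the whole input.
    static boolean allAgree(String input) {
        boolean a = ORIGINAL.matcher(input).matches();
        boolean b = UNESCAPED.matcher(input).matches();
        boolean c = GROUPED.matcher(input).matches();
        return a == b && b == c;
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs, chosen for illustration.
        String[] inputs = {"aaaa_bb_cc", "aaaa_bb_c", "aaaa-bb-cc"};
        for (String input : inputs) {
            System.out.println(input
                    + " matches=" + UNESCAPED.matcher(input).matches()
                    + " allAgree=" + allAgree(input));
        }
    }
}
```

Only the first input should match; the other two fail because one part is too short or the separator is wrong.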
I sometimes do what I call "meta-regexing" as follows:
String pattern = "x{4}_x{2}_x{2}".replace("x", "[a-z]");
System.out.println(pattern); // prints "[a-z]{4}_[a-z]{2}_[a-z]{2}"
Note that this doesn't use \w, which can match an underscore. That is, your original pattern would match "__________".
If x really needs to be replaced with [a-zA-Z0-9], then just do it in the one place (instead of 3 places).
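To make the underscore point concrete, here is a small sketch that wraps the template in a helper and compares the generated patterns with the \w version. The class name MetaRegexDemo and the helper build are hypothetical names, not part of any library.

```java
public class MetaRegexDemo {
    // Expands the placeholder "x" in the template to the given character class.
    static String build(String charClass) {
        return "x{4}_x{2}_x{2}".replace("x", charClass);
    }

    public static void main(String[] args) {
        String letters = build("[a-z]");
        String alnum = build("[a-zA-Z0-9]");
        String wordChars = "\\w{4}_\\w{2}_\\w{2}";

        System.out.println(letters);                               // [a-z]{4}_[a-z]{2}_[a-z]{2}
        System.out.println("aaaa_bb_cc".matches(letters));         // true
        System.out.println("ab12_C3_d4".matches(alnum));           // true
        // Ten underscores: accepted by \w, rejected by the explicit classes.
        System.out.println("__________".matches(wordChars));       // true
        System.out.println("__________".matches(letters));         // false
        System.out.println("__________".matches(alnum));           // false
    }
}
```

Changing the character class then means editing a single argument to build rather than three places in the pattern.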
Other examples
- Regex for metamap in Java
- How do I convert CamelCase into human-readable names in Java?
Yes, you can use just \\w{4}_\\w{2}_\\w{2} or maybe \\w{4}(_\\w{2}){2}.
It looks like your \w does not need to match an underscore, so you can use [a-zA-Z0-9] instead:
[a-zA-Z0-9]{4}_[a-zA-Z0-9]{2}_[a-zA-Z0-9]{2}