I am completely new to regular expressions so I'm looking for a bit of help here.
I am compiling under JDK 1.5
Take this line as an example that I read from standard input:
ab:Some string po:bubblegum
What I would like to do is split by the two characters and colon. That is, once the line is split and put into a string array, these should be the terms:
ab:Some string
po:bubblegum
I have this regex right now:
String[] split = input.split("[..:]");
This splits at the colon; what I would like is for it to match two characters and a colon, but split at the space before that match starts. Is this even possible?
Here is the output from the string array:
ab
Some string po
bubblegum
I've read about Pattern.compile() as well. Is this something I should be considering?
input.split(" (?=[A-Za-z]{2}:)")
The ?= creates a positive lookahead. This means the engine looks ahead to see if the next part matches, without consuming that part. If it does match, the split happens on the space character alone, so the two letters and the colon stay in the next term. [A-Za-z] means an upper- or lower-case letter, while {2} specifies we want two characters matching that class.
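To see the lookahead in action, here is a minimal sketch that runs the split on the example line from the question. The class name LookaheadSplit and the helper splitTerms are just names I made up for illustration; the only real API involved is String#split.

```java
public class LookaheadSplit {
    // Splits on a space only when that space is followed by two letters and a colon.
    // The lookahead (?=...) checks the following text without consuming it,
    // so "po:" stays attached to the second term.
    static String[] splitTerms(String input) {
        return input.split(" (?=[A-Za-z]{2}:)");
    }

    public static void main(String[] args) {
        String[] terms = splitTerms("ab:Some string po:bubblegum");
        for (String term : terms) {
            System.out.println(term);
        }
        // Prints:
        // ab:Some string
        // po:bubblegum
    }
}
```

Note that the space inside "Some string" is left alone, because "st" is followed by "r" rather than a colon.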
You asked about Pattern#compile(String pattern). You should consider using it if you are going to use the regex a lot, since that method compiles the regex once into something that's fast to execute, while calling String#split(String regex) directly recompiles the regex on every call.
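If you do go the Pattern route, it could look roughly like this: compile once, keep the Pattern in a constant, and call Pattern#split(CharSequence) for each line. The names PrecompiledSplit, TERM_BOUNDARY and splitTerms are hypothetical, and the second sample line is made up; in your program the lines would come from standard input instead.

```java
import java.util.regex.Pattern;

public class PrecompiledSplit {
    // Compiled once when the class is loaded, then reused for every line.
    private static final Pattern TERM_BOUNDARY = Pattern.compile(" (?=[A-Za-z]{2}:)");

    // Same splitting rule as input.split(...), but without recompiling the regex.
    static String[] splitTerms(String line) {
        return TERM_BOUNDARY.split(line);
    }

    public static void main(String[] args) {
        // Sample input; the second line is invented for illustration.
        String[] lines = { "ab:Some string po:bubblegum", "cd:x ef:y z" };
        for (String line : lines) {
            for (String term : splitTerms(line)) {
                System.out.println(term);
            }
        }
    }
}
```

For a handful of lines the difference won't be noticeable, so this is mostly worth it when the split runs in a loop over a lot of input.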