I have a string like this
This:string:must~:be:split:when:previous:char:is:not~:this
I need to split the line with the delimiter ":" but only if the character before the delimiter is NOT "~"
I have the following regex now:
String[] split = str.split(":(?<!~:)");
It works, but since I arrived at it purely by trial and error, I'm not convinced that it's the most efficient way of doing it. Also, this function will be called repeatedly on large strings, so performance does come into consideration. What is a more efficient way of doing it?
A slightly simpler approach is this:
(?<!~):
That way you don't match ":" twice. I doubt you'll see any difference in performance, though. It is also very simple to write without a regular expression: look for the next colon and check whether a tilde precedes it.
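Both ideas above can be sketched roughly as follows. This is only an illustration, not a tuned implementation: the class name TildeSplit and the method names are made up for this example, and the indexOf version is one possible way to write the "look for the next colon" approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical class and method names, used only for illustration.
public class TildeSplit {
    // Compiled once and reused; the lookbehind rejects a colon preceded by '~'.
    private static final Pattern UNESCAPED_COLON = Pattern.compile("(?<!~):");

    public static String[] splitWithLookbehind(String text) {
        return UNESCAPED_COLON.split(text);
    }

    // Regex-free variant: jump from colon to colon with indexOf and
    // cut substrings only at colons that are not preceded by '~'.
    public static String[] splitWithIndexOf(String text) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int colon = text.indexOf(':');
        while (colon >= 0) {
            if (colon == 0 || text.charAt(colon - 1) != '~') {
                parts.add(text.substring(start, colon));
                start = colon + 1;
            }
            colon = text.indexOf(':', colon + 1);
        }
        // Whatever follows the last accepted delimiter is the final piece.
        parts.add(text.substring(start));
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String input = "This:string:must~:be:split:when:previous:char:is:not~:this";
        System.out.println(Arrays.toString(splitWithLookbehind(input)));
        System.out.println(Arrays.toString(splitWithIndexOf(input)));
    }
}
```

For the sample string both methods give the same nine pieces. One difference to be aware of: String.split and Pattern.split drop trailing empty strings, while the indexOf sketch keeps them, so the results can differ for input that ends in a delimiter.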
Update: To make this more fair I wanted to use a compiled Pattern and see the results of that. So I updated the code to use compiled pattern, non-compiled pattern and my custom method.
While this isn't using regex, it proves to be faster than the regex given.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Class name is illustrative; the original snippet showed only the two methods.
public class SplitBenchmark {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(":(?<!~:)");
        for (int runs = 0; runs < 4; ++runs) {
            // 1) String.split, which handles the regex on every call
            long start = System.currentTimeMillis();
            for (int index = 0; index < 100000; ++index) {
                "This:string:must~:be:split:when:previous:char:is:not~:this".split(":(?<!~:)");
            }
            long stop = System.currentTimeMillis();
            System.out.println("Run: " + runs + " Regex: " + (stop - start));

            // 2) Pattern compiled once, outside the loop
            start = System.currentTimeMillis();
            for (int index = 0; index < 100000; ++index) {
                pattern.split("This:string:must~:be:split:when:previous:char:is:not~:this");
            }
            stop = System.currentTimeMillis();
            System.out.println("Run: " + runs + " Compiled Regex: " + (stop - start));

            // 3) Hand-written split without regex
            start = System.currentTimeMillis();
            for (int index = 0; index < 100000; ++index) {
                specialSplit("This:string:must~:be:split:when:previous:char:is:not~:this");
            }
            stop = System.currentTimeMillis();
            System.out.println("Run: " + runs + " Custom: " + (stop - start));
        }
        for (String s : specialSplit("This:string:must~:be:split:when:previous:char:is:not~:this")) {
            System.out.println(s);
        }
    }

    // Splits on ':' unless the previous character is '~'.
    public static String[] specialSplit(String text) {
        List<String> stringsAfterSplit = new ArrayList<String>();
        StringBuilder splitString = new StringBuilder();
        char previousChar = 0;
        for (int index = 0; index < text.length(); ++index) {
            char charAtIndex = text.charAt(index);
            if (charAtIndex == ':' && previousChar != '~') {
                // Delimiter found: store the current piece and start a new one.
                stringsAfterSplit.add(splitString.toString());
                splitString.setLength(0);
            } else {
                splitString.append(charAtIndex);
            }
            previousChar = charAtIndex;
        }
        // Add the last piece, if there is one.
        if (splitString.length() > 0) {
            stringsAfterSplit.add(splitString.toString());
        }
        return stringsAfterSplit.toArray(new String[stringsAfterSplit.size()]);
    }
}
Output
Run: 0 Regex: 468
Run: 0 Compiled Regex: 365
Run: 0 Custom: 169
Run: 1 Regex: 437
Run: 1 Compiled Regex: 363
Run: 1 Custom: 166
Run: 2 Regex: 445
Run: 2 Compiled Regex: 363
Run: 2 Custom: 167
Run: 3 Regex: 436
Run: 3 Compiled Regex: 361
Run: 3 Custom: 167
This
string
must~:be
split
when
previous
char
is
not~:this
Try this one.
[^~]:
Tested in JS
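One thing to check before using this pattern with split in Java: a character class consumes the character it matches, whereas a lookbehind does not. The sketch below compares the two on a shortened sample string; the class name CharClassVsLookbehind and its method names are made up for this illustration.

```java
import java.util.Arrays;

// Hypothetical class name, used only for illustration.
public class CharClassVsLookbehind {
    // "[^~]:" matches the colon AND the character before it,
    // so that character becomes part of the delimiter.
    public static String[] splitWithCharClass(String text) {
        return text.split("[^~]:");
    }

    // "(?<!~):" only inspects the previous character without consuming it.
    public static String[] splitWithLookbehind(String text) {
        return text.split("(?<!~):");
    }

    public static void main(String[] args) {
        String input = "This:string:must~:be";
        System.out.println(Arrays.toString(splitWithCharClass(input)));
        // prints [Thi, strin, must~:be]
        System.out.println(Arrays.toString(splitWithLookbehind(input)));
        // prints [This, string, must~:be]
    }
}
```

With the character class, the last character of each piece is lost when splitting. It also cannot match a colon at the very start of the string, because there is no preceding character for [^~] to match.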