
Using a regex in Java

Developer https://www.devze.com 2023-01-31 03:10 Source: Internet

I need to build a regular expression that matches all nonempty sequences of letters other than: file, for, from.

So I should end up getting all the values from my text input excluding the above 3 words.

Is this a correct way to represent it?

^(?:(?!file|for|from).)*$

I also tried to use this regex pattern in my Java program and assumed it would work, but it does not.

My sample code is as follows:

Pattern p = Pattern.compile("^(?:(?!file|for|from).)*$");

// Split input with the pattern
String[] result =
        p.split("file is not there from for this time for this test");
for (int i = 0; i < result.length; i++)
    System.out.println(result[i]);

Is there an error in my regex, or is there an error in the way I am using regex in Java?

Please advise.

Thank you.


You should do something like this:

String s = "file is not there from for this time for this test";
String[] splits = s.split("file|from|for");
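
For comparison, here is a small sketch of what the two splits return on the sample input. The class and method names (SplitDemo, splitWithOriginal, splitOnWords) are made up for illustration. The pattern from the question is anchored with ^ and $, and the input starts with "file", so it never matches and Pattern.split hands back the input as a single element. The plain alternation does split, but it matches the letters anywhere, so it would also cut inside a word such as "affordable".

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    static final String INPUT = "file is not there from for this time for this test";

    // Pattern from the question: it finds no match in INPUT,
    // so split returns the whole input as one element.
    static String[] splitWithOriginal() {
        return Pattern.compile("^(?:(?!file|for|from).)*$").split(INPUT);
    }

    // Alternation from the answer: splits wherever one of the three
    // letter sequences occurs, including at the very start of the input.
    static String[] splitOnWords() {
        return INPUT.split("file|from|for");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWithOriginal()));
        System.out.println(Arrays.toString(splitOnWords()));
    }
}
```

Note that the second result starts with an empty string, because the input begins with a match, and that the pieces keep their surrounding spaces.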


Unless your question is about learning regex, I agree with Isac: you are much better off simply splitting the string with a simple splitting algorithm and manually filtering out the entries you do not want. String.split() is our friend. If the list of words to ignore is large, consider keeping them in a hash table (for example a HashSet) and checking each extracted word against it. Each lookup then takes constant time on average, so filtering N words against M ignored words drops from O(N·M) to O(N).

You will end up with faster code that is also easier to read, write and maintain (and I guess you have not managed to get a working regex solution so far).

In my personal experience, regular expressions often become very complex, hard to read and slow to evaluate because of the many unforeseen backtracks caused by the use of .
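
The split-and-filter approach described in this answer could look roughly like the sketch below. It assumes that words are separated by whitespace and that only exact, case-sensitive matches are dropped. WordFilter and keepWords are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WordFilter {
    // Hypothetical helper: returns the words of text that are not in ignored.
    static List<String> keepWords(String text, Set<String> ignored) {
        List<String> kept = new ArrayList<>();
        // Assumption: words are separated by runs of whitespace.
        for (String word : text.trim().split("\\s+")) {
            // Skip the empty token produced by an empty input,
            // and any word found in the ignore set (average O(1) lookup).
            if (!word.isEmpty() && !ignored.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> ignored = new HashSet<>(Arrays.asList("file", "for", "from"));
        String s = "file is not there from for this time for this test";
        // prints [is, not, there, this, time, this, test]
        System.out.println(keepWords(s, ignored));
    }
}
```

Because the comparison is on whole words, something like "affordable" or "format" is kept, which may or may not be what the question intends.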


The question is ambiguous. Do you want to extract all the words of a string that aren't one of (file, for, from)? Or do you want to match the strings that don't include any of these words? 'for' is refused, but what about 'affordable': accepted or refused?

As far as I understand the question, to catch the words in a string that are not 'file', 'for' or 'from', I propose the following regex:

'\b(?!file|for|from)\w+'
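
In Java this kind of pattern can be used with Matcher.find() rather than split. One caveat: as written above, the lookahead also rejects any word that merely begins with one of the three, so "format" would be skipped. The sketch below is a variant, not the pattern above: it assumes only the exact words should be excluded and adds a \b inside the lookahead for that. WordMatcher and extract are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordMatcher {
    // Assumption: only the exact words file, for and from are excluded.
    // The \b inside the lookahead keeps words such as "format" eligible.
    static final Pattern WORD =
            Pattern.compile("\\b(?!(?:file|for|from)\\b)\\w+");

    // Collects every match of WORD in text, in order of appearance.
    static List<String> extract(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [is, not, there, this, time, this, test]
        System.out.println(extract("file is not there from for this time for this test"));
        // prints [format, the, me]
        System.out.println(extract("format the file for me"));
    }
}
```

Either version accepts "affordable", since \b only matches at the start of the word, where the lookahead sees "aff".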
