I have a large ArrayList containing 1000 entries, one of which is "world". I also have the phrase "big world". I want to match the phrase "big world" against the entry "world" in the ArrayList.
What is the most cost-effective way of doing this? I cannot use the ArrayList's .contains method, and if I traverse all 1000 entries and match each one by pattern, it is going to be very costly in terms of performance. I am using Java for this.
Could you please let me know what is the best way for this?
Cheers, J
You can split every element of the ArrayList into words and stop as soon as you find one of the words you are looking for.
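A minimal sketch of that idea, assuming words are separated by whitespace and that case should be ignored (neither is stated in the question). The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class WordOverlapSearch {

    // Returns the first entry that shares at least one whole word with the
    // phrase, or null if no entry does. Stops at the first hit.
    static String firstEntrySharingWord(List<String> entries, String phrase) {
        Set<String> phraseWords = new HashSet<>(Arrays.asList(splitWords(phrase)));
        for (String entry : entries) {
            for (String word : splitWords(entry)) {
                if (!word.isEmpty() && phraseWords.contains(word)) {
                    return entry;
                }
            }
        }
        return null;
    }

    // Assumption: whitespace-separated words, compared case-insensitively.
    static String[] splitWords(String text) {
        return text.trim().toLowerCase(Locale.ROOT).split("\\s+");
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList("hello", "world", "foo");
        System.out.println(firstEntrySharingWord(entries, "big world")); // prints "world"
    }
}
```

The phrase is split once into a HashSet, so each word of each entry is checked in constant time on average. The scan over the list itself is still linear in the worst case.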
Judging by your profile you develop in Java; with Lucene you could easily do something like this:
public class NodesAnalyzer extends Analyzer {
    // Filter chain: tokenize, lowercase, drop stop words, then stem
    public TokenStream tokenStream(String fieldName, Reader reader) {
        Tokenizer tokenizer = new StandardTokenizer(reader)
        TokenFilter lowerCaseFilter = new LowerCaseFilter(tokenizer)
        TokenFilter stopFilter = new StopFilter(lowerCaseFilter, Data.stopWords.collect{ it.text } as String[])
        SnowballFilter snowballFilter = new SnowballFilter(stopFilter, new org.tartarus.snowball.ext.ItalianStemmer())
        return snowballFilter
    }
}

// str is the text to analyze
Analyzer analyzer = new NodesAnalyzer()
TokenStream ts = analyzer.tokenStream(null, new StringReader(str));
Token token = ts.next()
while (token != null) {
    String cur = token.term()  // current normalized term
    token = ts.next();
}
Note: this is Groovy code copied from a personal project, so you will have to translate constructs such as Data.stopWords.collect{ it.text } as String[] into plain Java.
Assuming you don't know the contents of the ArrayList elements, you will have to traverse the whole ArrayList.
Traversing the ArrayList costs O(n).
Sorting the ArrayList wouldn't help, because you are searching for a string within a set of strings, and sorting would be even more expensive: O(n log n).
If you have to search the list repeatedly, it may make sense to use the sort() and binarySearch() methods of Collections.
Addendum: As noted by @user177883, the cost of an O(n log n) sort must be weighed against the benefit of subsequent O(log n) searches.
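As a rough sketch of that trade-off: sort a copy of the list once, then look up each word of the phrase with Collections.binarySearch. This assumes whitespace-separated words and exact, case-sensitive matches; the class name is only illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SortedWordLookup {
    private final List<String> sorted;

    // One-time O(n log n) cost: copy the entries and sort the copy.
    SortedWordLookup(List<String> entries) {
        sorted = new ArrayList<>(entries);
        Collections.sort(sorted);
    }

    // Returns the first word of the phrase that is an exact entry in the
    // list, or null. Each lookup is an O(log n) binary search.
    String findWordOf(String phrase) {
        for (String word : phrase.trim().split("\\s+")) {
            if (Collections.binarySearch(sorted, word) >= 0) {
                return word;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SortedWordLookup lookup =
                new SortedWordLookup(Arrays.asList("zebra", "world", "apple"));
        System.out.println(lookup.findWordOf("big world")); // prints "world"
    }
}
```

This only pays off if the same list is searched many times, and it finds whole entries only, not partial matches inside an entry.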
The word "heart" matches the [word] "ear".
As an exact match is insufficient, this approach would be inadequate.
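To make the distinction concrete, here is a small sketch comparing a plain substring test with a whole-word test. The question does not say which of the two it needs, and the helper names are hypothetical.

```java
import java.util.regex.Pattern;

final class MatchKinds {

    // Substring test: true whenever entry occurs anywhere inside text.
    static boolean containsSubstring(String text, String entry) {
        return text.contains(entry);
    }

    // Whole-word test: entry must be delimited by word boundaries in text.
    // Pattern.quote keeps regex metacharacters in entry literal.
    static boolean containsWholeWord(String text, String entry) {
        return Pattern.compile("\\b" + Pattern.quote(entry) + "\\b")
                .matcher(text)
                .find();
    }

    public static void main(String[] args) {
        System.out.println(containsSubstring("heart", "ear"));       // prints "true"
        System.out.println(containsWholeWord("heart", "ear"));       // prints "false"
        System.out.println(containsWholeWord("big world", "world")); // prints "true"
    }
}
```

If substring matches such as "heart"/"ear" are wanted, a sorted list with binarySearch cannot find them; if only whole words count, splitting the phrase into words first brings exact lookups back into play.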
I had a very similar issue.
I solved it by using this if/else if statement.
if (myArrayList.contains(wordThatIsEntered)
        && wordThatCantBeMatched.equals(wordThatIsEntered)) {
    Toast.makeText(getApplicationContext(),
            "WORD CAN'T BE THE SAME OR THAT WORD ISN'T HERE",
            Toast.LENGTH_SHORT).show();
}
else if (myArrayList.contains(wordThatIsEntered)) {
    Toast.makeText(getApplicationContext(),
            "FOUND THE EXACT WORD YOU ARE LOOKING FOR!",
            Toast.LENGTH_SHORT).show();
}