I am writing a piece of code in which I have to find only complete words. For example, if I have
String str = "today is tuesday";
and I'm searching for "t", then I should not find any word.
Can anybody tell me how I can write such a program in Java?
I use regexps for such tasks. In your case it should look something like this:
String str = "today is tuesday";
return str.matches(".*?\\bt\\b.*?"); // returns "false"
String str = "today is t uesday";
return str.matches(".*?\\bt\\b.*?"); // returns "true"
A short explanation:
. matches any character, *? matches the preceding element zero or more times (reluctantly), and \b is a word boundary.
More information on regexps can be found here or specifically for java here
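As a minimal sketch of the same idea for an arbitrary search word, the pattern can be built from the word itself and tested with Matcher.find(), which searches anywhere in the text, so the surrounding .*? is not needed. The class and method names (WholeWordRegex, containsWord) are made up for illustration, and the use of Pattern.quote is my assumption for handling search words that contain regex metacharacters.

```java
import java.util.regex.Pattern;

class WholeWordRegex {
    // Hypothetical helper: true if 'word' occurs in 'text' delimited by word boundaries.
    public static boolean containsWord(String text, String word) {
        // Pattern.quote escapes any regex metacharacters in the search word.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        // find() looks for the pattern anywhere in the text.
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWord("today is tuesday", "t"));       // prints "false"
        System.out.println(containsWord("today is t uesday", "t"));      // prints "true"
        System.out.println(containsWord("today is tuesday", "tuesday")); // prints "true"
    }
}
```

Note that \b only behaves as expected when the search word starts and ends with a word character (a letter, digit, or underscore).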
String sentence = "Today is Tuesday";
Set<String> words = new HashSet<String>(
Arrays.asList(sentence.split(" "))
);
System.out.println(words.contains("Tue")); // prints "false"
System.out.println(words.contains("Tuesday")); // prints "true"
Each contains(word) query is O(1) on average, so short of implementing your own sophisticated dictionary data structure, this is the fastest, most practical solution if you have many words to look for in a text.
This uses String.split to separate out the words from the sentence on the " " delimiter. Another possible variation, depending on how the problem is defined, is to use \b, the word boundary anchor. The problem is considerably more difficult if you must take every grammatical feature of natural languages into consideration (e.g. "can't" is split by \b into "can" and "t").
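To make the "can't" caveat concrete, here is a small sketch that collects every run of word characters delimited by \b. The class and method names (BoundaryTokens, tokens) are hypothetical, and using a Matcher loop rather than String.split is my own choice for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryTokens {
    // Hypothetical helper: returns each run of word characters found between word boundaries.
    public static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("\\b\\w+\\b").matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // The apostrophe is not a word character, so "can't" yields two tokens.
        System.out.println(tokens("I can't today")); // prints "[I, can, t, today]"
    }
}
```

With this tokenization a search for "t" would report a match in "I can't today", which is usually not what a reader would call a complete word.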
Case insensitivity can be easily introduced by using the traditional case normalization trick: split and hash sentence.toLowerCase() instead, and see if it contains(word.toLowerCase()).
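The case normalization trick might look like the sketch below. The names CaseInsensitiveWords, wordSet and containsWord are invented for the example, and passing Locale.ROOT to toLowerCase is my assumption, added so the result does not depend on the machine's default locale.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class CaseInsensitiveWords {
    // Hypothetical helper: lower-cases the sentence, splits it on spaces and hashes the words.
    public static Set<String> wordSet(String sentence) {
        return new HashSet<>(Arrays.asList(sentence.toLowerCase(Locale.ROOT).split(" ")));
    }

    // Hypothetical helper: lower-cases the query the same way before the lookup.
    public static boolean containsWord(Set<String> words, String word) {
        return words.contains(word.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Set<String> words = wordSet("Today is Tuesday");
        System.out.println(containsWord(words, "TUESDAY")); // prints "true"
        System.out.println(containsWord(words, "today"));   // prints "true"
        System.out.println(containsWord(words, "Tue"));     // prints "false"
    }
}
```

The set is built once and can then answer many lookups, which is where the hashing approach pays off.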
See also
- regular-expressions.info -- Anchors
- Wikipedia -- String searching algorithm
- Wikipedia -- Patricia Trie
String[] tokens = str.split(" ");
for(String s: tokens) {
if ("t".equals(s)) {
// t exists
break;
}
}
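String[] words = str.split(" ");
Arrays.sort(words); // binarySearch requires a sorted array
boolean found = Arrays.binarySearch(words, searchedFor) >= 0; // a non-negative index means the word exists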
String str = "today is tuesday";
StringTokenizer stringTokenizer = new StringTokenizer(str);
boolean exists = false;
while (stringTokenizer.hasMoreTokens()) {
if (stringTokenizer.nextToken().equals("t")) {
exists = true;
break;
}
}
Use a regex like "\bt\b" (written as "\\bt\\b" in a Java string literal).
You can do that by using a regex that ends with a space.
I would recommend you use the "split" functionality for String with spaces as separators, then go through these elements one by one and make a direct comparison.
I would suggest using the regex pattern1 = ".*\bt\b.*" instead of pattern2 = ".*?\bt\b.*?". Pattern1 uses greedy quantifiers, which consume the complete string when 't' occurs in it, whereas the reluctant quantifiers in pattern2 consume only as much as is needed to reach the "t" you are searching for. There is not much difference between the two approaches, and for your particular use case of returning true/false with matches() both will work fine. The one I suggested will be easier to adapt if your use case changes later.
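The difference between the two quantifier styles can be seen in a short sketch, assuming the patterns are the greedy ".*\\bt\\b.*" and the reluctant ".*?\\bt\\b.*?" written as Java string literals. The class and helper names (GreedyVsReluctant, firstMatch) are made up for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Hypothetical helper: returns the first substring matched by find(), or null if none.
    public static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "today is t uesday";
        String greedy = ".*\\bt\\b.*";
        String reluctant = ".*?\\bt\\b.*?";

        // matches() must cover the whole string, so both patterns agree.
        System.out.println(text.matches(greedy));    // prints "true"
        System.out.println(text.matches(reluctant)); // prints "true"

        // find() shows how much each pattern consumes on its own.
        System.out.println(firstMatch(greedy, text));    // prints "today is t uesday"
        System.out.println(firstMatch(reluctant, text)); // prints "today is t"
    }
}
```

For a plain true/false check with matches() the choice does not change the result; it only matters once the matched text itself is used.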