
Efficient way to implement a String Array "is in" method using Java

Developer https://www.devze.com 2023-02-13 09:50 Source: web

I have a requirement to present highly structured information picked from a highly unstructured web service. In order to display the info correctly, I have to do a lot of String matches and duplicate removals to ensure I'm picking the right combination of elements.

One of my challenges involves determining if a String is in an Array of Strings.

My dream is to do "searchString.isIn(stringArray);" but I realize the String class doesn't provide for that.

Is there a more efficient way of doing this than this stub?

private boolean isIn(String searchString, String[] searchArray) {
  // Linear scan: compares the search string against every element in turn.
  for (String singleString : searchArray) {
    if (singleString.equals(searchString)) {
      return true;
    }
  }
  return false;
}


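You may want to look into HashSet (or HashMap, if you also need a value attached to each key). Both give expected constant-time lookups, and the check is a single call: set.contains(searchString) for a HashSet, or map.containsKey(searchString) for a HashMap.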
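A minimal sketch of that lookup, assuming the array is loaded once and then queried many times. The class name StringLookup and the sample strings are made up for illustration; only HashSet and its contains method come from the answer itself.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class; the name StringLookup is not from the original post.
class StringLookup {
    private final Set<String> known;

    // Copy the array into a HashSet once, so later lookups do not rescan it.
    StringLookup(String[] searchArray) {
        this.known = new HashSet<>(Arrays.asList(searchArray));
    }

    // Expected constant-time membership test.
    boolean isIn(String searchString) {
        return known.contains(searchString);
    }

    public static void main(String[] args) {
        StringLookup lookup = new StringLookup(new String[] {"alpha", "beta", "beta"});
        System.out.println(lookup.isIn("beta"));  // prints true
        System.out.println(lookup.isIn("gamma")); // prints false
    }
}
```

Building the set costs one pass over the array, so this only pays off when the same array is searched more than a handful of times.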


Additionally, HashSet (and HashMap for its keys) prevents duplicate elements.

If you need to keep them in order of insertion, you can look into their linked counterparts, LinkedHashSet and LinkedHashMap, and if you need to keep them sorted, TreeSet and TreeMap can help (note, however, that TreeSet and TreeMap lookups take O(log n) time rather than constant time).
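To make the ordering difference concrete, here is a small sketch. The class name OrderedSets, its method names, and the sample data are illustrative assumptions; LinkedHashSet and TreeSet are the standard java.util classes.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; names are for illustration only.
class OrderedSets {
    // Removes duplicates while keeping first-seen (insertion) order.
    static Set<String> inInsertionOrder(List<String> items) {
        return new LinkedHashSet<>(items);
    }

    // Removes duplicates and keeps elements in natural (lexicographic) order.
    static Set<String> sorted(List<String> items) {
        return new TreeSet<>(items);
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(inInsertionOrder(raw)); // prints [pear, apple, fig]
        System.out.println(sorted(raw));           // prints [apple, fig, pear]
    }
}
```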

Everybody else seems to be viewing this question in a broader scope (which is certainly valid). I am only answering this bit:

One of my challenges involves determining if a String is in an Array of Strings.

That's simple:

return Arrays.asList(arr).contains(str);



If you are doing this a lot, you can initially sort the array and do a binary search for your strings.
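A rough sketch of the sort-then-search approach, assuming the array contains no null elements (the natural-order comparison would throw on them). The class name SortedLookup is hypothetical; Arrays.sort and Arrays.binarySearch are the standard library calls.

```java
import java.util.Arrays;

// Hypothetical helper class; the name SortedLookup is not from the original post.
class SortedLookup {
    private final String[] sorted;

    // Sort a private copy once; the caller's array is left untouched.
    SortedLookup(String[] searchArray) {
        this.sorted = searchArray.clone();
        Arrays.sort(this.sorted);
    }

    // Arrays.binarySearch returns a non-negative index only when the key is found.
    boolean isIn(String searchString) {
        return Arrays.binarySearch(sorted, searchString) >= 0;
    }

    public static void main(String[] args) {
        SortedLookup lookup = new SortedLookup(new String[] {"pear", "apple", "fig"});
        System.out.println(lookup.isIn("fig"));  // prints true
        System.out.println(lookup.isIn("kiwi")); // prints false
    }
}
```

Each lookup then takes O(log n) comparisons instead of scanning the whole array.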

As mentioned, a HashMap or HashSet can perform considerably better than the loop in your question. How much better depends greatly on how evenly your hash function distributes the keys and how many buckets are in the map.

You could also keep a sorted list and perform a binary search on that list, which could perform slightly better, though you pay the cost of sorting. If it's a one-time sort, then that's not a big deal. If the list is constantly changing, you may pay a larger cost, because it has to be kept sorted after every change.

Lastly, you could consider a Trie structure. I think this would be the fastest way to search, but that's a gut reaction. I don't have the numbers to support that.

As explained before, you can use a Set for that purpose (see http://download.oracle.com/javase/1.5.0/docs/api/java/util/Set.html and especially the boolean contains(Object o) method). Here is a quick 'n dirty example that demonstrates this:

import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

String[] a = {"a", "2"};
Set<String> hashSet = new HashSet<String>();
Collections.addAll(hashSet, a);             // copy every array element into the set
System.out.println(hashSet.contains("a"));  // prints true
System.out.println(hashSet.contains("2"));  // prints true
System.out.println(hashSet.contains("e"));  // prints false

Hope this helps ;)

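As Zach has pointed out, you can use a HashSet to prevent duplicates and its contains method to search for a string; contains returns true when a match is found. String already implements equals and hashCode, so strings need no extra work, but if you store objects of your own class you also need to override equals (and hashCode, so that equal objects land in the same bucket) in that class:

public boolean equals(Object other) {
  return other != null && other instanceof L && this.l == ((L) other).l;
}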
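For a hash-based collection, equals only does its job when hashCode is overridden consistently with it. Below is a minimal sketch under that assumption; the class Label and its field text are hypothetical stand-ins for the L class in the snippet above, not the poster's actual code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class standing in for the "L" class in the snippet above.
final class Label {
    private final String text;

    Label(String text) {
        this.text = text;
    }

    @Override
    public boolean equals(Object other) {
        // instanceof is false for null, so no separate null check is needed.
        return other instanceof Label && Objects.equals(text, ((Label) other).text);
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal objects produce equal hash codes.
        return Objects.hashCode(text);
    }

    public static void main(String[] args) {
        Set<Label> labels = new HashSet<>();
        labels.add(new Label("a"));
        labels.add(new Label("a")); // duplicate by equals, so it is not added again
        System.out.println(labels.size());                   // prints 1
        System.out.println(labels.contains(new Label("a"))); // prints true
    }
}
```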

If the search space (your collection of strings) is limited, then I agree with the answers already posted. If, however, you have a large collection of strings and need to perform enough searches on it to outweigh the setup overhead, you might also consider encoding the search strings in a trie data structure. Again, this would only be advantageous if there are enough strings and you search enough times to justify the setup overhead.
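For reference, a bare-bones trie might look like the sketch below. This is an illustrative assumption rather than a tuned implementation: the class name StringTrie, the per-node HashMap of children, and the method names are all made up here, and whether it actually beats a HashSet would need measuring on your data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal trie; names and layout are for illustration only.
class StringTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if a stored string ends at this node
    }

    private final Node root = new Node();

    // Walks (and creates where missing) one node per character, then marks the end.
    void add(String s) {
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.computeIfAbsent(s.charAt(i), c -> new Node());
        }
        node.terminal = true;
    }

    // Follows the characters of s; succeeds only if the path exists and ends a stored string.
    boolean contains(String s) {
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.get(s.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return node.terminal;
    }
}
```

A lookup costs time proportional to the length of the search string, independent of how many strings are stored; the setup overhead mentioned above is the one-time cost of calling add for every string.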


