开发者

How to do lookups using a list of string?

开发者 https://www.devze.com 2023-02-14 06:55 出处:网络
I have a list of strings (a List<String>) that can have anywhere from 1 to 6 entries. What I want to be able to do is use that list of string to do a lookup, but I want the possible lookups to b

I have a list of strings (a List<String>) that can have anywhere from 1 to 6 entries. What I want to be able to do is use that list of string to do a lookup, but I want the possible lookups to be able to use any combination of 2 or more of those strings to do the lookup. I was using a Dictionary<List<String>, String> currently.

ex. Suppose my list has the following in it: "fire", "aero", "thunder", "water", "blizzard" and I have the following entries in my dictionary:

List<String>(){"fire", "aero"}, "searing wind"
List<String>(){"fire", "aero", "thunder"} "firestorm"
List<String>(){"aero", "thunder"}, "storm"
List<String>(){"aero", "water", "blizzard"}, "snowstorm"
List<String>(){"aerora", "blizzara"}, "hailstorm"

I want the lookup to return the first 4 entries since my base list cont开发者_如何学编程ains all the values required to look them up. I also need to be able to know what values were used to do the lookup since those values will need to be cleared from the base list later. The number of entries in the dictionary will likely be ~400

I can think of an exhaustive way to do this lookup, but because the fact that the order is going to matter when doing the lookup, it's going to take along time to make all the permutations and look them up. I could enforce alphabetic order in the dictionary key lists, if that would help. Does anyone know of a better way to do this, or perhaps a different, more effective way to do this? I'm already using sqlite for some other stuff in this program so if thats going to give me quicker lookups I could use that.

Thanks


One option you may want to explore would be to use a decision tree. The idea would be something like this. Pick some arbitrary string, then split all of your sets into two groups - groups containing that string and groups not containing that string. Then, recursively repeat this procedure on both groups and construct a tree from all of the decisions you've made. For example, let's introduce a shorthand for your notation:

A = Aero

R = Aerora

F = Fire

T = Thunder

W = Water

B = Blizzard

Then you could build a tree like this:

start ──▶   A?  ── NO ──▶ R? ── YES ──▶ B? ── YES ──▶ "hailstorm"
             │
             └─── YES ──▶ F? ── YES ──▶ T? ── YES ──▶ "firestorm"
                          │             │
                          │             └───── NO ──▶ "searing wind"
                          │
                          └───── NO ──▶ T? ── YES ──▶ "storm"
                                        │
                                        └───── B? ── YES ──▶ "snowstorm"

Once you have a tree like this, you could store your attributes as a set of strings and then look up all matches as follows. Starting from the root of the tree, look at the string indicated by the given node. If that string is contained in your set of strings, then recursively continue down the YES branch and find all matches in that part of the tree. Then, regardless of whether you looked down that branch, explore down the NO branch to get all other strings that could match your query.

The advantage of this approach is that, assuming you have a small number of strings as keywords, the depth of the tree can be very small - at most O(k) for k keywords - and so in the best case your search will take as little as O(k) time. In the worst case, you just explore the entire tree, which takes time O(n). Moreover, using techniques from machine learning, it's possible to construct a very good tree structure that will have a solid tradeoff between size and lookup speed.

Hope this helps!

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号