I have a large text file (~10 MB) that contains more or less every dictionary word in a specific language, one word per line (newline-delimited).
I want to do a really fast lookup to see if a word exists in the file. What is the fastest way to do this without looping through each line?
It is sorted, and I can do all the pre-processing I want.
I considered doing some sort of binary search, but I didn't know how to do it, since my lines are not all a fixed number of bytes (and thus I wouldn't know where to jump to in the stream). Surprisingly, I also couldn't find a tool that would pad the lines to a fixed width for me.
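For what it's worth, a binary search does not strictly need fixed-width lines: one common trick is to seek to an arbitrary byte offset, throw away the rest of whatever line you landed in, and compare against the next complete line. Below is a minimal sketch of that idea, assuming the file is sorted by raw byte order (for example with `LC_ALL=C sort`) and encoded as UTF-8. The names `line_at` and `word_in_sorted_file` are made up for this illustration.

```python
import os


def line_at(f, pos):
    """Return the first complete line that starts at or after byte offset pos.

    Returns b"" only when there is no such line (end of file).
    """
    if pos > 0:
        # Start one byte early: if that byte is a newline we land exactly on
        # a line start, otherwise we skip the remainder of the current line.
        f.seek(pos - 1)
        f.readline()
    else:
        f.seek(0)
    return f.readline()


def word_in_sorted_file(path, word):
    """Binary search over byte offsets in a byte-sorted, newline-delimited file."""
    target = word.encode("utf-8")  # assumption: the file is UTF-8
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        # Find the smallest offset whose "next complete line" is >= target.
        while lo < hi:
            mid = (lo + hi) // 2
            raw = line_at(f, mid)
            if raw == b"" or raw.rstrip(b"\r\n") >= target:
                hi = mid
            else:
                lo = mid + 1
        raw = line_at(f, lo)
        return raw != b"" and raw.rstrip(b"\r\n") == target
```

Each lookup touches only about log2(file size) positions, so roughly two dozen short reads for a 10 MB file. The comparison is on bytes, so the result is only correct if the sort order of the file matches byte order.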
Any suggestions? Thanks!
I'd suggest building a Trie from the dictionary. That gives you very quick lookups to see whether a word is in there.
A trie is a good bet if you don't mind using some more storage: http://en.wikipedia.org/wiki/Trie
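To make the trie suggestion concrete, here is a minimal sketch of one, assuming the whole word list fits in memory and is loaded once up front. The class names `TrieNode` and `Trie` are illustrative and do not come from any particular library.

```python
class TrieNode:
    """One node per character; is_word marks the end of a complete word."""

    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}
        self.is_word = False


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for w in words:
            self.insert(w)

    def insert(self, word):
        node = self.root
        for ch in word:
            nxt = node.children.get(ch)
            if nxt is None:
                nxt = node.children[ch] = TrieNode()
            node = nxt
        node.is_word = True

    def __contains__(self, word):
        # Walk one node per character; cost depends on word length only,
        # not on how many words are stored.
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word
```

Loading could look like `Trie(line.strip() for line in open("words.txt", encoding="utf-8"))`, where `words.txt` is a placeholder file name, after which a lookup is just `"hello" in trie`. As noted above, the price is memory: every character gets its own node, so the in-memory structure can be considerably larger than the file itself.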