
Index by word length

Developer https://www.devze.com 2023-03-16 13:50 Source: web

My aim was simply to make a hangman game, but I have been slightly over-ambitious. I want to ask the user how long they want the word to be, then choose a random word of that length. Scanning the entire dictionary for words of that length would take far too long on each iteration. I have a dictionary, formatted like so:

zymosans

zymoscope

zymoses

...

I would like this program to automatically output one file per word length, like this:

1letterwords.txt

2letterwords.txt

and so forth.

I started Python... yesterday. I searched both the web and this site and came up with nothing. I would like some pointers on how to start with this specific programming problem. Thanks in advance! (To clarify, the hangman game would open a random line in the file for the requested word length, which should reduce the performance impact fairly dramatically.)


It's really not that big of a deal to load an entire dictionary into memory. You can try something like this:

import random
from collections import defaultdict

# load words
index = defaultdict(list)
with open('words.txt') as file:
    for line in file:
        word = line.strip().lower()
        index[len(word)].append(word)

# pick a random word
length = int(input('Enter word length: '))  # Python 3; use raw_input on Python 2
word = random.choice(index[length])
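
One thing to watch for: because `index` is a defaultdict, asking for a length that has no words quietly creates an empty list, and `random.choice` then raises an IndexError. A minimal sketch of a guarded lookup follows. The helper name `pick_word` and the three sample words are made up for illustration and stand in for the real index built from words.txt.

```python
import random
from collections import defaultdict


def pick_word(index, length):
    """Return a random word of the given length, or None if there is none.

    pick_word is a hypothetical helper, not part of the answer above.
    """
    # .get() avoids the defaultdict side effect of inserting an empty list
    candidates = index.get(length)
    if not candidates:
        return None
    return random.choice(candidates)


# Small in-memory stand-in for the index built from words.txt
index = defaultdict(list)
for word in ['zymosans', 'zymoscope', 'zymoses']:
    index[len(word)].append(word)

print(pick_word(index, 7))   # the only 7-letter word here is 'zymoses'
print(pick_word(index, 3))   # None: no 3-letter words in the sample
```

The game can then re-prompt the user whenever the result is None.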

And if you insist on generating separate files by word length, run this code after loading the index as shown above:

for length in sorted(index):
    path = 'words%d.txt' % length
    with open(path, 'w') as file:
        for word in index[length]:
            file.write('%s\n' % word)
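
If you do go with per-length files, the game only has to read the one file it needs. Here is a rough sketch, assuming the `words%d.txt` naming used in the loop above. The helper `load_words_of_length` and its `directory` argument are hypothetical, and the demo writes a tiny file into a temporary directory so it can run on its own.

```python
import os
import random
import tempfile


def load_words_of_length(directory, length):
    """Read the word file for one length; return [] if it does not exist.

    Hypothetical helper; assumes the 'words%d.txt' naming used above.
    """
    path = os.path.join(directory, 'words%d.txt' % length)
    if not os.path.exists(path):
        return []
    with open(path) as file:
        # skip blank lines and drop trailing newlines
        return [line.strip() for line in file if line.strip()]


# Demo with a throwaway directory so the sketch runs on its own
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, 'words7.txt'), 'w') as file:
    file.write('zymoses\nzymotic\n')

words = load_words_of_length(demo_dir, 7)
print(random.choice(words))                  # one of the two 7-letter words
print(load_words_of_length(demo_dir, 12))    # [] because words12.txt is absent
```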


Getting random lines from files is probably not what you want to do either. Keeping the words in a list and/or dict should be fine even for millions of words.

You can store lists of words by length by iterating over all your words and appending each one to a defaultdict whose default value is a list:

from collections import defaultdict
import random

wordsByLength = defaultdict( list )
for word in allWords:
    wordsByLength[ len(word) ].append( word )

Then whenever you need a random word you can do:

randomLen = random.choice( list( wordsByLength ) )  # list() is required on Python 3
randomWord = random.choice( wordsByLength[ randomLen ] )

Or you can replace randomLen with the specified length you wanted.


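For example, you can filter a word list read straight from a URL:

import random
from urllib.request import urlopen  # Python 3; on Python 2 use urllib.urlopen
url = urlopen('http://download.oracle.com/javase/tutorial/collections/interfaces/examples/dictionary.txt')
random.choice([w for w in (line.decode().strip() for line in url) if len(w) == 8])  # strip so the newline is not counted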


Sure, the simple way isn't that efficient, but is it really too slow?

In [1]: import random

In [2]: timeit words = list(open("sowpods.txt"))
10 loops, best of 3: 48.4 ms per loop

In [3]: words = list(open("sowpods.txt"))

In [4]: len(words)
Out[4]: 267751

In [5]: timeit random.choice([w for w in words if len(w.strip())==6])
10 loops, best of 3: 62.5 ms per loop

In [6]: random.choice([w for w in words if len(w.strip())==6])
Out[6]: 'NAPKIN\r\n'

The one-liner version takes less than a tenth of a second on this computer:

In [7]: timeit random.choice([w for w in open("sowpods.txt") if len(w.strip())==6])
10 loops, best of 3: 91.2 ms per loop

In [8]: random.choice([w for w in open("sowpods.txt") if len(w.strip())==6])
Out[8]: 'REVEUR\r\n'

You can add a .strip() to that to get rid of the '\r\n' on the end
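
One way to do that is to strip each line once while loading, so both the length test and the returned word are free of line endings. This is only a sketch: a few in-memory lines stand in for sowpods.txt, and `words_of_length` is a hypothetical name.

```python
import random


def words_of_length(lines, length):
    """Strip line endings first, then keep words of the requested length."""
    stripped = (line.strip() for line in lines)
    return [word for word in stripped if len(word) == length]


# Stand-in for the lines of a word file with Windows line endings
lines = ['NAPKIN\r\n', 'REVEUR\r\n', 'ZYMOSES\r\n']

six_letter = words_of_length(lines, 6)
print(six_letter)                  # ['NAPKIN', 'REVEUR']
print(random.choice(six_letter))   # a clean word with no trailing '\r\n'
```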
