python - remove string from words in an array

Developer https://www.devze.com 2023-02-04 03:16 Source: web
#!/usr/bin/python
#this looks for words in dictionary that begin with 'in' and the suffix is a real word
wordlist = [line.strip() for line in open('/usr/share/dict/words')]
newlist = []
for word in wordlist:
    if word.startswith("in"):
        newlist.append(word)
for word in newlist:
    word = word.split('in')
print newlist

How would I get the program to remove the string "in" from the start of every word that begins with it? Right now it does not work.


#!/usr/bin/env python

# Look for all words beginning with 'in'
# such that the rest of the word is also
# a valid word.

# load the dictionary:
with open('/usr/share/dict/words') as inf:
    allWords = set(word.strip() for word in inf)  # one word per line
  1. Using 'with' ensures the file is always properly closed.
  2. allWords is a set, which makes each membership test an O(1) operation on average.

then we can do

# get the remainder of all words beginning with 'in'
inWords = [word[2:] for word in allWords if word.startswith("in")]
# filter to get just those which are valid words
inWords = [word for word in inWords if word in allWords]

or roll it into a single statement, like

inWords = [word for word in (word[2:] for word in allWords if word.startswith("in")) if word in allWords]

Doing it the second way also lets us use a generator for the inside loop, reducing memory requirements.
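
For a quick check without the system dictionary, the same two steps can be run against a small hand-made set. The set below is a made-up sample, not the contents of /usr/share/dict/words, and the names all_words, remainders and in_words are only illustrative.

```python
# Hypothetical sample standing in for the dictionary file.
all_words = {"in", "inside", "side", "input", "put", "index", "apple", "into", "to"}

# Step 1: strip the leading 'in' from every word that starts with it.
remainders = [word[2:] for word in all_words if word.startswith("in")]

# Step 2: keep only the remainders that are themselves in the set.
# sorted() is used because a set has no guaranteed order.
in_words = sorted(word for word in remainders if word in all_words)

print(in_words)  # ['put', 'side', 'to']
```

Note that "in" itself leaves an empty string as its remainder, which is dropped because the empty string is not in the set.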


split() returns a list of the segments obtained by splitting. Furthermore,

word = word.split('in')

doesn't modify your list; it only rebinds the loop variable.

Try replacing your second loop with this:

for i in range(len(newlist)):
    word = newlist[i].split('in', 1)
    newlist[i] = word[1]
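
The difference between rebinding the loop variable and assigning through an index can be seen on a tiny list. This is only a sketch; the sample words are made up for illustration.

```python
words = ["inside", "input"]

# Rebinding the loop variable leaves the list untouched.
for word in words:
    word = word.split("in")
unchanged = list(words)

# Assigning through the index replaces the element in the list.
# maxsplit=1 splits only at the first 'in'.
for i in range(len(words)):
    words[i] = words[i].split("in", 1)[1]

print(unchanged)  # ['inside', 'input']
print(words)      # ['side', 'put']
```

The second argument to split() matters for words that contain "in" more than once: "inline".split("in") gives ['', 'l', 'e'], while "inline".split("in", 1) gives ['', 'line'].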


It's difficult to tell from your question what you want in newlist. If you just want the words that start with "in", but with the "in" removed, then you can use a slice:

newlist = [word[2:] for word in wordlist if word.startswith('in')]

If you want the words that start with "in" and are still in wordlist once the "in" has been removed (is that what you meant by "real" in your comment?), then you need something a little different:

newlist = [word for word in wordlist if word.startswith('in') and word[2:] in wordlist]
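
A small sketch of the two readings side by side, using a made-up word list. The set named known is an addition here, assumed only so that the membership test does not scan the whole list each time.

```python
wordlist = ["inside", "side", "index", "input", "put", "apple"]
known = set(wordlist)  # assumption: a set for fast lookups

# Reading 1: every 'in' word, with the prefix removed.
stripped = [word[2:] for word in wordlist if word.startswith("in")]

# Reading 2: only 'in' words whose remainder is also a word
# (the matching words are kept whole).
real = [word for word in wordlist
        if word.startswith("in") and word[2:] in known]

print(stripped)  # ['side', 'dex', 'put']
print(real)      # ['inside', 'input']
```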

Note that in Python we use a list, not an "array".


Suppose that wordlist is the list of words. The following code should do the trick:

for i in range(len(wordlist)):
    if wordlist[i].startswith("in"):
        wordlist[i] = wordlist[i][2:]

If the list is very large, note that in Python 2 range(len(wordlist)) builds the whole list of indices first; xrange, or a while loop with a manual counter, avoids that.
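
As a sketch of the same in-place update, enumerate() yields each index together with its word, so no separate list of indices is needed. The sample list is made up.

```python
wordlist = ["inside", "apple", "input"]

for i, word in enumerate(wordlist):
    if word.startswith("in"):
        wordlist[i] = word[2:]  # replace the element in place

print(wordlist)  # ['side', 'apple', 'put']
```

Replacing elements while iterating is safe here because the length of the list never changes.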
