I have a method:
def strip_searchname(self, original_name):
    taboo = {" and ", " of ", " at ", " in ", ":", "-", ",", " the ", " "}
    searchname = original_name
    for word in taboo:
        print(searchname)
        searchname = searchname.replace(word, "")
    searchname = re.sub('[^a-zA-Z]', "", searchname)
    searchname = searchname.upper()
    return searchname
(yes, I know parts of it are redundant)
The first .replace seems to be stripping the entire string of whitespace, which I do NOT want. Why is this? How do I avoid it?
(e.g. output is:
Seattle University
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SEATTLEUNIVERSITY
)
What I DON'T understand is why it seems to be executing the " " replace BEFORE the " of " replace, for example, when the " of " replace comes before the space in the list.
It's not a list.
taboo = {" and ", " of ", " at ", " in ", ":", "-", ",", " the ", " "}
is a set literal. Try replacing { and } by [ and ] to get the order you want.
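As a minimal sketch of that change, here is the same logic with taboo written as a list. It is shown as a standalone function (no self) and without the debugging print, purely for illustration; those two simplifications are assumptions, not part of the original method.

```python
import re


def strip_searchname(original_name):
    # A list is iterated in the order it is written, so " of " is
    # removed before the lone " " at the end gets its turn.
    taboo = [" and ", " of ", " at ", " in ", ":", "-", ",", " the ", " "]
    searchname = original_name
    for word in taboo:
        searchname = searchname.replace(word, "")
    # Drop anything that is not an ASCII letter, then upper-case.
    searchname = re.sub("[^a-zA-Z]", "", searchname)
    return searchname.upper()


result = strip_searchname("University of Washington")
```

With the list, " of " is stripped while the surrounding spaces are still present, so result no longer contains "OF". With the set, the lone space may be processed first, after which " of " can no longer match.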
The replace method on a string replaces all occurrences of the first argument with the second argument. In your loop, when the string word equals " ", the replace method will delete every occurrence of " " in searchname.
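A small illustration of that behaviour, using a made-up sample string. The optional third argument to str.replace limits how many occurrences are replaced; it is shown only for contrast and is not something the original code uses.

```python
name = "Seattle University of Washington"

# Without a count, every occurrence of " " is removed.
no_spaces = name.replace(" ", "")

# With count=1, only the first occurrence is removed.
first_only = name.replace(" ", "", 1)

print(no_spaces)
print(first_only)
```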
Maybe the problem is that taboo is not a list. It is a set, and sets do not keep the order in which you wrote the elements.
See
>>> taboo = ['a', 'b', ' ']
>>> print taboo
['a', 'b', ' ']
>>> taboo = {'a', 'b', ' '}
>>> print taboo
set(['a', ' ', 'b'])
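If you would rather keep the set, one possible workaround (a different technique from simply switching to a list) is to sort it before looping, for instance longest entry first, so that the single space is always handled after the multi-character words. This is only a sketch; the name ordered is hypothetical.

```python
taboo = {" and ", " of ", " at ", " in ", ":", "-", ",", " the ", " "}

# Sort by length, longest first. Entries of equal length may still
# come out in any order relative to each other.
ordered = sorted(taboo, key=len, reverse=True)

searchname = "University of Washington"
for word in ordered:
    searchname = searchname.replace(word, "")
```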