python string replace is deleting whitespace incorrectly

Developer https://www.devze.com 2023-04-01 16:46 Source: the web
I have a method

def strip_searchname(self, original_name):
    taboo = {" and ", " of ", " at ", " in ", ":", "-", ",", " the ", " "}
    searchname = original_name
    for word in taboo:
        print(searchname)
        searchname = searchname.replace(word, "")
    searchname = re.sub('[^a-zA-Z]', "", searchname)
    searchname = searchname.upper()
    return searchname

(yes, I know parts of it are redundant)

The first .replace seems to be stripping the entire string of whitespace, which I do NOT want. Why is this? How do I avoid it?

(e.g. output is:

Seattle University
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SeattleUniversity
SEATTLEUNIVERSITY

)


What I DON'T understand is why it seems to be executing the " " replace BEFORE the " of " replace, for example, when the " of " replace comes before the space in the list.

It's not a list.

taboo = {" and ", " of ", " at ", " in ", ":", "-", ",", " the ", " "}

is a set literal, and sets are unordered. Replace { and } with [ and ] to make it a list, which keeps the order you wrote.
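Here is a minimal sketch of the method with a list in place of the set. It is written as a plain function without self, and the module-level name TABOO is my own choice, not something from the question. Because a list is iterated in the order it is written, " of " is removed before the bare " ".

```python
import re

# A list keeps insertion order, so the multi-character entries
# are handled before the bare space at the end.
TABOO = [" and ", " of ", " at ", " in ", ":", "-", ",", " the ", " "]


def strip_searchname(original_name):
    searchname = original_name
    for word in TABOO:
        searchname = searchname.replace(word, "")
    # Drop anything that is not an ASCII letter, then upper-case.
    searchname = re.sub("[^a-zA-Z]", "", searchname)
    return searchname.upper()


print(strip_searchname("University of Washington"))  # UNIVERSITYWASHINGTON
```

With the set version, the " " entry may be processed first, in which case " of " no longer matches and the result would be UNIVERSITYOFWASHINGTON instead.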


The replace method on a string replaces all occurrences of its first argument with its second. In your loop, when word equals " ", replace deletes every occurrence of " " in searchname.
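To see that behaviour in isolation, here is a small example (the sample string is made up). It also shows the optional third argument of str.replace, which caps the number of replacements, in case removing only the first match is ever what you need.

```python
name = "University of Puget Sound"

# With two arguments, every occurrence is replaced.
all_removed = name.replace(" ", "")
print(all_removed)    # UniversityofPugetSound

# The optional count argument limits how many occurrences are replaced.
first_removed = name.replace(" ", "", 1)
print(first_removed)  # Universityof Puget Sound
```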


Maybe the problem is that taboo is not a list but a set, and sets do not preserve order.

See

>>> taboo = ['a', 'b', ' ']
>>> print taboo
['a', 'b', ' ']
>>> taboo = {'a', 'b', ' '}
>>> print taboo
set(['a', ' ', 'b'])
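The session above is Python 2. A rough Python 3 equivalent is sketched below; the variable names are mine. In Python 3 the iteration order of a set of strings can even differ from one run to the next, so no output is shown for the set. If you want duplicates removed but the order kept, dict.fromkeys is one option, since dicts preserve insertion order from Python 3.7 on.

```python
taboo_list = ["a", "b", " "]
taboo_set = {"a", "b", " "}

print(taboo_list)       # always ['a', 'b', ' ']
print(list(taboo_set))  # same elements, but in an arbitrary order

# De-duplicate while keeping first-seen order (Python 3.7+).
ordered_unique = list(dict.fromkeys(["a", "b", " ", "a"]))
print(ordered_unique)   # ['a', 'b', ' ']
```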