I read the artist of a song from its MP3 tag, then create a folder based on that name. The problem I have is when the name contains a special character like 'AC\DC'. So I wrote this code to deal with that.
    def replace_all(text):
        print "replace_all"
        dictionary = {'\\':"", '?':"", '/':"", '...':"", ':':"", chr(148):"o"}
        for i, j in dictionary.iteritems():
            text = text.replace(i, j)
        return text
What I am running into now is how to deal with non-English characters, like the o-umlaut in Motörhead or Blue Öyster Cult.
As you can see, I tried adding the ASCII-string version of the o-umlaut at the end of the dictionary, but that failed with
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
I found this code, though I don't understand it.
    import unicodedata

    def strip_accents(s):
        return ''.join(c for c in unicodedata.normalize('NFD', s) if unicodedata.category(c) != 'Mn')
It enabled me to remove the accent marks from the path of proposed dir/filenames.
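For anyone else puzzled by that function, here is a small sketch of what it seems to be doing, step by step. The sample name and the variable names are only for illustration. NFD normalization splits an accented letter into its base letter plus a separate combining mark, and the 'Mn' (nonspacing mark) category test then drops the mark. The u'' literals keep it usable on both Python 2 and Python 3.3+.

```python
import unicodedata

# Illustrative input: o-umlaut stored as the single code point U+00F6.
name = u'Mot\u00f6rhead'

# NFD decomposes U+00F6 into 'o' followed by U+0308 (combining diaeresis).
decomposed = unicodedata.normalize('NFD', name)

# Combining marks have the Unicode category 'Mn' (Mark, nonspacing).
marks = [c for c in decomposed if unicodedata.category(c) == 'Mn']

# Keeping everything that is not a combining mark leaves the base letters.
stripped = u''.join(c for c in decomposed if unicodedata.category(c) != 'Mn')
```

Note that this only works on unicode text. If the tag comes back as a byte string, it would need to be decoded first.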
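I suggest using unicode for both the input text and the replacement characters. In your example, chr(148) returns a one-byte string, not a unicode character, so mixing it with unicode text forces an implicit ASCII decode that fails.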
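A minimal sketch of that suggestion, under some assumptions: the 0xc3 byte in the error hints that the tag text is UTF-8, but that is a guess, so the encoding is a parameter here. The REPLACEMENTS name and the encoding argument are mine, not from the question. It is written to run on Python 2 and on Python 3.3+.

```python
# Every key and value is a unicode string, so no implicit ASCII decoding happens.
REPLACEMENTS = {
    u'\\': u'', u'?': u'', u'/': u'', u'...': u'', u':': u'',
    u'\u00f6': u'o',  # o-umlaut
    u'\u00d6': u'O',  # capital O-umlaut
}

def replace_all(text, encoding='utf-8'):
    # Assumption: byte-string input is encoded as `encoding` (UTF-8 is a guess).
    if isinstance(text, bytes):
        text = text.decode(encoding)
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    return text
```

With this, replace_all(u'AC\\DC') gives u'ACDC', and the UTF-8 bytes for "Motörhead" come back as u'Motorhead'. Listing every accented letter by hand does not scale, so combining this with the normalize-based approach may be the better route.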