Python: cleaning up a string

Developer https://www.devze.com 2023-01-14 06:01 Source: Internet
I have a string like this:

somestring='in this/ string / i have many. interesting.occurrences of {different chars} that need     to .be removed  '

Here is the result I want:

somestring='in this string i have many interesting occurrences of different chars that need to be removed'

I started chaining all kinds of .replace calls by hand, but there are so many different combinations that I think there must be a simpler way. Perhaps there's a library that already does this?

Does anyone know how I can clean up this string?


I would use a regular expression to replace every run of non-alphanumeric characters with a single space, then strip the ends:

>>> import re
>>> somestring='in this/ string / i have many. interesting.occurrences of {different chars} that need     to .be removed  '
>>> rx = re.compile(r'\W+')
>>> res = rx.sub(' ', somestring).strip()
>>> res
'in this string i have many interesting occurrences of different chars that need to be removed'
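
One caveat with \W: it treats the underscore as a word character, so underscores survive the substitution. If those should go too, a small variation might look like the sketch below. The function name clean is made up for illustration, and adding _ to the character class is an assumption about what counts as noise here, not something the question asks for.

```python
import re

def clean(text):
    # Hypothetical helper: replace each run of non-word characters
    # (or underscores) with one space, then trim both ends.
    return re.sub(r'[\W_]+', ' ', text).strip()

somestring = 'in this/ string / i have many. interesting.occurrences of {different chars} that need     to .be removed  '
print(clean(somestring))
print(clean('snake_case__name'))
```

With plain r'\W+' the second call would leave snake_case__name untouched.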


You have two steps: remove the punctuation, then remove the extra whitespace.

1) Use the string's translate method

import string
# Python 3: str.maketrans replaces the old string.maketrans function
trans_table = str.maketrans(string.punctuation, " " * len(string.punctuation))
new_string = somestring.translate(trans_table)

This builds and then applies a translation table that maps each punctuation character to a space.

2) Remove excess whitespace

new_string = " ".join(new_string.split())
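
The two steps can be wrapped into one function, roughly as sketched below. This assumes Python 3, where maketrans is a static method on str. The name strip_punctuation is made up for illustration.

```python
import string

# Build the table once: every ASCII punctuation character maps to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, ' ' * len(string.punctuation))

def strip_punctuation(text):
    # Step 1: turn punctuation into spaces.
    spaced = text.translate(_PUNCT_TO_SPACE)
    # Step 2: split on any whitespace and rejoin with single spaces,
    # which also drops leading and trailing whitespace.
    return ' '.join(spaced.split())

somestring = 'in this/ string / i have many. interesting.occurrences of {different chars} that need     to .be removed  '
print(strip_punctuation(somestring))
```

Note that string.punctuation includes the apostrophe, so a word like "don't" would come out as "don t". Drop that character from the table if contractions matter.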


import re
" ".join(re.sub(r'[\[\]/{}.,]+', ' ', somestring).split())