What's the best solution, using regex, to remove special characters from the beginning and the end of every word?
"as-df-- as-df- as-df (as-df) 'as-df' asdf-asdf) (asd-f asdf' asd-f' -asdf- %asdf%s asdf& $asdf$ +asdf+ asdf++ asdf''"
The output should be:
"as-df-- as-df- as-df (as-df) as-df asdf-asdf) (asd-f asdf' asd-f' asdf %asdf%s asdf& asdf asdf asdf++ asdf''"
If the special character at the beginning of a word matches the one at the end, remove both.
I am learning about regex. [regex only]
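import re

# Note the trailing space after "(asd-f": adjacent string literals are
# concatenated as-is, so without it two words would run together.
a = ("as-df-- as-df- as-df (as-df) 'as-df' asdf-asdf) (asd-f "
     "asdf' asd-f' -asdf- %asdf%s asdf& $asdf$ +asdf+ asdf++ asdf''")
# Strip a special character only when the same character (group "chr")
# both starts and ends a whitespace-delimited word; keep the middle (group 3).
b = re.sub(r"((?<=\s)|\A)(?P<chr>[-()+%&'$])([^\s]*)(?P=chr)((?=\s)|\Z)", r"\3", a)
print(b)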
Gives:
as-df-- as-df- as-df (as-df) as-df asdf-asdf) (asd-f asdf' asd-f' asdf %asdf%s asdf& asdf asdf asdf++ asdf''
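If you would rather not list every special character by hand, a variant along the following lines might work. It is only a sketch: it assumes a "special character" is anything that is neither a word character nor whitespace, and that a "word" is a run of non-whitespace. The names PATTERN and strip_matching_ends are made up for illustration.

```python
import re

# Assumption: "special" means neither a word character nor whitespace,
# and words are delimited by whitespace or the ends of the string.
PATTERN = re.compile(r"(?<!\S)([^\w\s])(\S*)\1(?!\S)")

def strip_matching_ends(text):
    """Remove a special character that both starts and ends a word."""
    # Group 1 is the leading character, \1 requires the same character at
    # the end of the word, and group 2 is whatever sits in between.
    return PATTERN.sub(r"\2", text)

print(strip_matching_ends("-asdf- 'as-df' %asdf%s (as-df) asdf''"))
```

The negative lookarounds (?<!\S) and (?!\S) play the same role as the \A / \Z alternations above: they succeed at whitespace or at either end of the string.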
Getting non-identical pairs of characters to work is trickier: (), [], {}.
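For Perl, how about s/(?<!\S)([^\s\w])(\S*)\1(?!\S)/$2/g? Note that \b would not anchor here, since it needs a word character on one side, and that lookarounds like (?<!\S) don't work in all regex languages.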
Oops, as @Nick pointed out, this doesn't work for non-identical pairs, like () [] etc.
Instead you could do:
s/(?<!\S)([^\s\w()\[\]{}])(\S*)\1(?!\S)/$2/g
s/(?<!\S)\((\S*)\)(?!\S)/$1/g
s/(?<!\S)\[(\S*)\](?!\S)/$1/g
s/(?<!\S)\{(\S*)\}(?!\S)/$1/g
(untested)
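For anyone following along in Python, the one-pass-per-pair idea could be sketched roughly like this. Note that it goes beyond the question, whose expected output leaves (as-df) untouched. The PAIRS table and the strip_bracket_pairs name are assumptions for illustration, and words are again taken to be runs of non-whitespace.

```python
import re

# Hypothetical pair table; extend it if other delimiters matter to you.
PAIRS = {"(": ")", "[": "]", "{": "}"}

def strip_bracket_pairs(text):
    """Remove an opening/closing bracket pair that wraps a whole word."""
    for opener, closer in PAIRS.items():
        # One substitution per pair: opener at the start of a word,
        # the matching closer at the end, and the middle kept as group 1.
        pattern = (r"(?<!\S)" + re.escape(opener)
                   + r"(\S*)" + re.escape(closer) + r"(?!\S)")
        text = re.sub(pattern, r"\1", text)
    return text

print(strip_bracket_pairs("(as-df) [asdf] {asdf} (asd-f asdf-asdf) -asdf-"))
```

Because \S* cannot cross whitespace, an opener in one word and a closer in another, as in "(asd-f asdf-asdf)", are left alone.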