Python Challenge #2: removing characters from a string

I have the code:

theory = """}#)$[]_+(^_@^][]_)*^*+_!{&$##]((](}}{[!$#_{&{){
*_{^}$#!+]{[^&++*#!]*)]%$!{#^&%(%^*}@^+__])_$@_^#[{{})}$*]#%]{}{][@^!@)_[}{())%)
())&#@*[#}+#^}#%!![#&*}^{^(({+#*[!{!}){(!*@!+@[_(*^+*]$]+@+*_##)&)^(@$^]e@][#&)(
%%{})+^$))[{))}&$(^+{&(#%*@&*(^&{}+!}_!^($}!(}_@@++$)(%}{!{_]%}$!){%^%%@^%&#([+[
_+%){{}(#_}&{&++!@_)(_+}%_#+]&^)+]_[@]+$!+{@}$^!&)#%#^&+$@[+&+{^{*[@]#!{_*[)(#[[
]*!*}}*_(+&%{&#$&+*_]#+#]!&*@}$%)!})@&)*}#(@}!^(]^@}]#&%)![^!$*)&_]^%{{}(!)_&{_{
+[_*+}]$_[#@_^]*^*#@{&%})*{&**}}}!_!+{&^)__)@_#$#%{+)^!{}^@[$+^}&(%%)&!+^_^#}^({
*%]&@{]++}@$$)}#]{)!+@[^)!#[%@^!!"""

#theory = open("temp.txt")

key = "#@!$%+{}[]_-&*()*^@/"
new2 =""

print()
for letter in theory:
    if letter not in key:
        new2 += letter

print(new2)

This is a test piece of code to solve Python Challenge #2: http://www.pythonchallenge.com/pc/def/ocr.html

The only trouble is that the code I wrote seems to leave lots of whitespace in the output, and I'm not sure why.

Any ideas on how to remove the unnecessary whitespace? In other words, I want the code to return "e", not " e ".

The challenge is to find a rare character. You could use collections.Counter for that:

from collections import Counter

c = Counter(theory)
print(c.most_common()[-1])

Output

('e', 1)

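The unnecessary whitespace could be removed using .strip(). It returns a new string, so assign the result:

new2 = new2.strip()

Note that .strip() only trims leading and trailing whitespace, so it works here only because a single letter is left. Adding '\n' to the key works too, and also removes newlines that fall between letters.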
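To see where the whitespace comes from, it can help to print repr() of the filtered string instead of the string itself. The snippet below is only a sketch: `sample` is a made-up three-line stand-in for the puzzle text, and `key` is a shortened version of the one in the question.

```python
# Hypothetical three-line sample standing in for the full puzzle text.
sample = """#@!$
%e+{
}[]"""
key = "#@!$%+{}[]_-&*()^/"

# Same filter as in the question: the newlines are not in key, so they survive.
filtered = "".join(ch for ch in sample if ch not in key)
print(repr(filtered))

# With "\n" added to the key, the newlines are dropped as well.
cleaned = "".join(ch for ch in sample if ch not in key + "\n")
print(repr(cleaned))
```

The first repr() makes the leftover newline characters visible, which is what shows up as blank space around the letter.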

The best would be to use the regular expression library, like so:

import re
characters = re.findall("[a-zA-Z]", theory)
print("".join(characters))

The resulting string will contain only alphabetic characters.

If you look at the distribution of characters (using collections.Counter), you get:

6000+ each of )@(]#_%[}!+$&{*^ (which you are correctly excluding from the output)
1220 newlines (which you are not excluding from the output)
1 each of — no, I'm not going to give away the answer

Just add \n to your key variable to exclude the unwanted newlines. This will leave you with just the rare (i.e., 1 occurrence only) characters you need.
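As a rough sketch of the counting idea, the rare characters can also be picked out directly by their count, without maintaining a key at all. The `sample` string here is invented for illustration and is much smaller than the real puzzle data.

```python
from collections import Counter

# Hypothetical stand-in for the puzzle text; the real data is far larger.
sample = "#@#@\n$h$#\n@i@$\n"

counts = Counter(sample)

# Keep the characters that occur exactly once, in their original order.
rare = "".join(ch for ch in sample if counts[ch] == 1)
print(rare)
```

This assumes the wanted characters each occur exactly once, as described above. If the noise characters or newlines also happened to occur only once, they would be kept too.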

P.S., it's highly inefficient to concatenate strings in a loop. Instead of:

new2 =""

for letter in theory:
    if letter not in key:
        new2 += letter

write:

new2 = ''.join(letter for letter in theory if letter not in key)
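Another option, not used in the question, is str.translate with a deletion table. This is only a sketch: the `theory` value is a short made-up sample, and `key` already includes the newline.

```python
# Assumed stand-ins for the question's variables; the sample text is hypothetical.
theory = "}#)$[\n]_e+(\n^_@"
key = "#@!$%+{}[]_-&*()^/\n"

# The third argument of str.maketrans lists characters to map to None,
# so translate() deletes every one of them.
delete_table = str.maketrans("", "", key)
new2 = theory.translate(delete_table)
print(new2)
```

The result is the same as with the join version. Which of the two is faster depends on the input, so it is worth measuring before relying on either.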

The theory string contains several newlines, and your code prints them. You can either get rid of the newlines, like this:

theory = "}#)$[]_+(^_@^][]_)*^*+_!{&$##]((](}}{[!$#_{&{){" \
"*_{^}$#!+]{[^&++*#!]*)]%$!{#^&%(%^*}@^+__])_$@_^#[{{})}$*]#%]{}{][@^!@)_[}{())%)" \
"())&#@*[#}+#^}#%!![#&*}^{^(({+#*[!{!}){(!*@!+@[_(*^+*]$]+@+*_##)&)^(@$^]e@][#&)(" \
"%%{})+^$))[{))}&$(^+{&(#%*@&*(^&{}+!}_!^($}!(}_@@++$)(%}{!{_]%}$!){%^%%@^%&#([+[" \
"_+%){{}(#_}&{&++!@_)(_+}%_#+]&^)+]_[@]+$!+{@}$^!&)#%#^&+$@[+&+{^{*[@]#!{_*[)(#[[" \
"]*!*}}*_(+&%{&#$&+*_]#+#]!&*@}$%)!})@&)*}#(@}!^(]^@}]#&%)![^!$*)&_]^%{{}(!)_&{_{" \
"+[_*+}]$_[#@_^]*^*#@{&%})*{&**}}}!_!+{&^)__)@_#$#%{+)^!{}^@[$+^}&(%%)&!+^_^#}^({" \
"*%]&@{]++}@$$)}#]{)!+@[^)!#[%@^!!"

or you can filter them out, like this:

key = "#@!$%+{}[]_-&*()*^@/\n"

Both work fine (yes, I tested).

A simpler way to output the answer is:

print(''.join(c for c in theory if c not in key))

In your case you might also want to add the newline character to key so that it is filtered out too:

key += "\n"

You'd better work in reverse and keep only the letters, something like this:

out = []
for i in theory:
    a = ord(i)
    # keep ASCII lowercase (97-122) and uppercase (65-90) letters, bounds inclusive
    if 97 <= a <= 122 or 65 <= a <= 90:
        out.append(i)

print(''.join(out))

Or better, use a regexp.
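A minimal sketch of the regexp variant, assuming only ASCII letters should be kept. The `sample` string is made up and deliberately includes the boundary letters A, Z, a and z, which are easy to lose with hand-written ord() ranges.

```python
import re

# Hypothetical sample that includes the boundary letters A, Z, a and z.
sample = "#A@z\n$Z%a^e&"

# Delete everything that is not an ASCII letter, newlines included.
letters = re.sub(r"[^A-Za-z]", "", sample)
print(letters)
```

Because this whitelists letters instead of listing the symbols to drop, it needs no key and handles newlines without special treatment.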