I'd like to write a program that modify python programs in this way:
change
"开发者_如何学Pythonsome literal string %" % SOMETHING
to
functioncall("some literal string %") % SOMETHING
Thanks,
It may be simpler with tokenize -- adapting the example in the docs,
import cStringIO
import tokenize
class Lookahead(object):
def __init__(self, s):
self._t = tokenize.generate_tokens(cStringIO.StringIO(s).readline)
self.lookahead = next(self._t, None)
def __iter__(self):
return self
def next(self):
result = self.lookahead
if result is None: raise StopIteration
self.lookahead = next(self._t, None)
return result
def doit(s):
toks = Lookahead(s)
result = []
for toktype, tokvalue, _, _, _ in toks:
if toktype == tokenize.STRING:
pk = toks.lookahead
if pk is not None and pk[0] == tokenize.OP and pk[1] == '%':
result.extend([
(tokenize.NAME, 'functioncall'),
(tokenize.OP, '('),
(tokenize.STRING, repr(tokvalue)),
(tokenize.OP, ')')
])
continue
result.append((toktype, tokvalue))
return tokenize.untokenize(result)
print doit('"some literal string %" % SOMETHING')
This prints functioncall ('"some literal string %"')%SOMETHING
. The spacing is pretty peculiar (it takes much more effort to get the spacing just right -- but that is even worse for reconstructing sources from a modified AST), but it's just fine if all you're going to do is import / run the resulting code (not so fine if you want to get nicely readable and editable code -- but that's a big enough problem that I'd suggest a separate Q;-).
You can solve this with having to write a program. Instead simply use the best editor ever made: Emacs. Worth learning if you haven't already. With it you can solve this by using its regex-replace capability. The only trouble is that I rarely use regex's so I always forget the details of the cryptic syntax and have to still look it up :P I'll try to figure it again for you. Here's a link to the Search & Replace Info for Emacs - scroll down for using regex's
Here is another SO question that may be useful.
I gather that the ast
module doesn't have a facility for returning to source code, but Armin Ronacher has written a module codegen which implements a to_source
function for doing just that for ast
nodes.
I haven't tried to do this myself.
import re
pattern = r'(".+? %")(?= %)'
oldstr = '"some literal string %" % SOMETHING'
newstr = re.sub(pattern, r'functioncall(\1)', oldstr)
Try something like that. (Though with file I/O, of course.) I haven't worked with ast
at all yet, so I don't really know if using that would be any easier for something like this, but it seems to me that if you're just doing a simple search-replace and not actually doing a lot of complex parsing then there's no need to use ast
.
精彩评论