Does anyone have some good resources on learning more advanced regular expressions?
I keep having problems where I want to make sure something is not enclosed in quotation marks.
For example, I am trying to write an expression that will match lines in a Python file containing an assignment, such as
a = 4
which is easy enough, but I am having trouble devising an expression that would be able to separate out multiple terms or ones wrapped in quotes like these:
a, b = b, a
a,b = "You say yes, ", "i say no"
Parsing code with regular expressions is generally not a good idea, because the grammar of a programming language is not a regular language. I'm not much of a Python programmer, but I think you would be a lot better off parsing Python code with Python modules such as this one or this one.
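As a minimal sketch of that suggestion, assuming the standard library's `tokenize` module is an acceptable choice (it may not be one of the modules this answer had in mind), the following finds lines with an `=` operator outside of brackets. The helper name `assignment_lines` is hypothetical. Because string literals arrive as single tokens, an `=` or a comma inside quotes is never mistaken for an operator.

```python
import io
import tokenize

OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}


def assignment_lines(source):
    """Return 1-based line numbers holding an '=' operator outside brackets.

    Hypothetical helper for illustration only; it does not handle every
    Python construct (for example, it treats annotated assignments the same
    way and ignores augmented assignments such as '+=').
    """
    found = []
    depth = 0  # nesting depth of (), [] and {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.OP:
            continue  # names, numbers and whole string literals are skipped
        if tok.string in OPENERS:
            depth += 1
        elif tok.string in CLOSERS:
            depth -= 1
        elif tok.string == "=" and depth == 0:
            found.append(tok.start[0])
    return found


example = (
    'a = 4\n'
    'a, b = b, a\n'
    'a,b = "You say yes, ", "i say no"\n'
    's = "x = 1"\n'
    'f(x=1)\n'
)
print(assignment_lines(example))
```

The keyword argument in `f(x=1)` is skipped because its `=` sits inside parentheses, and `==` is a separate operator token, so comparisons are not reported.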
I think that you have to tokenize the expression to evaluate it correctly, but you can detect the pattern using the following regex:
r'^\s*(\w+)(\s*,\s*\w+)*\s*=\s*(.*?)(\s*,\s*.*?)*$'
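If group(2) and group(4) are not empty, you have to tokenize the expression.
Note that a line such as
a,b = f(b,a), g(a,b)
is hard to analyze this way, because the commas inside the calls look the same to the regex as the commas that separate the terms.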
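A rough sketch of how this check could be used, assuming an anchored form of the pattern (optional leading whitespace, with `^` and `$` so the lazy groups extend to the end of the line). The names `PATTERN` and `needs_tokenizing` are hypothetical. The last call shows why the match is only a detector: the comma inside the quoted string cuts the first right-hand term short.

```python
import re

# Anchored form of the detection pattern; the anchors are an assumption
# made here so that the lazy groups have to reach the end of the line.
PATTERN = re.compile(r'^\s*(\w+)(\s*,\s*\w+)*\s*=\s*(.*?)(\s*,\s*.*?)*$')


def needs_tokenizing(line):
    """True when both sides of the '=' appear to hold several terms."""
    m = PATTERN.match(line)
    return bool(m and m.group(2) and m.group(4))


print(needs_tokenizing('a = 4'))        # False
print(needs_tokenizing('a, b = b, a'))  # True

# The regex splits at the first comma it sees, even inside a string literal.
m = PATTERN.match('a,b = "You say yes, ", "i say no"')
print(m.group(3))  # -> "You say yes   (the closing quote is cut off)
```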
Python has an excellent Language Reference that also includes descriptions of the lexical analysis and syntax.
In your case, both statements are assignments with a list of targets on the left-hand side and a list of expressions on the right-hand side.
But since parts of that grammar are context-free and not regular, you can't use regular expressions (unless they support some kind of recursive patterns). So you are better off using a proper parser, as Jonas H suggested.
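One way to follow that advice, sketched here with the standard library's `ast` module (an assumption, since the answers do not name a specific parser), is to let Python parse the source and then read the targets and expressions off each assignment node. `split_assignments` and `_parts` are hypothetical helper names, and `ast.get_source_segment` requires Python 3.8 or later.

```python
import ast


def _parts(node):
    """Split a tuple node into its elements; wrap anything else in a list."""
    return node.elts if isinstance(node, ast.Tuple) else [node]


def split_assignments(source):
    """Return (targets, expressions) source strings for each assignment.

    Illustrative only: for chained assignments such as 'a = b = 1',
    just the first target list is reported.
    """
    result = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            targets = [ast.get_source_segment(source, t)
                       for t in _parts(node.targets[0])]
            values = [ast.get_source_segment(source, v)
                      for v in _parts(node.value)]
            result.append((targets, values))
    return result


code = (
    'a = 4\n'
    'a, b = b, a\n'
    'a,b = "You say yes, ", "i say no"\n'
    'a,b = f(b,a), g(a,b)\n'
)
for targets, values in split_assignments(code):
    print(targets, '<-', values)
```

Commas inside string literals and inside the calls `f(b,a)` and `g(a,b)` cause no trouble here, because the parser has already resolved the nesting before the terms are read out.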