I have just started self-studying the Dragon book of Compiler Design. I am working on a problem that says to design grammar for an expression containin开发者_JAVA技巧g binary +,-,*,/ and unary +,-
I came up with following
E -> E+T | E-T | T
T -> T*P | T/P | P
P -> +S | -S | S
S -> id | constant | (E)
However, there is an obvious flaw in it. According to this grammar, expressions like
1--3
are valid, which is an error in all programming languages I know. Though, expressions like
1+-+3
and
1- -3
must be valid. How can such a grammar be designed?
I believe your problem is with tokenization. You are identifying 1--3
as an error because you think it should be resolved as 1 --3
rather than 1 - -3
, the latter being perfectly valid. So I think you problem comes because when you tokenize the string you are getting:
['1', '-', '-' , '3']
rather than:
['1', '--', '3']
I think you have one extra production rule
P -> +S | -S | S
S -> id | constant | (E)
can be shrinked to
P -> +P | -P | id | constant | (E)
With such grammar you will succesfully match exp "1+-+3" as valid.
You have tokenizer(scanner)
problem! before passing token to parser You must differentiate between "-" and "--". you must define a token struct which contains a token type and value, then parse the token list.
Also the rule P->--S
must be added to the production rules!!
精彩评论