Programming a simple compiler

开发者 https://www.devze.com 2023-02-19 20:44 Source: Internet

I am writing a compiler for a simple language.

I made a lexer/tokenizer that takes a file and prints the tokens to stdout.

Now I want to do the syntactic analysis, but I don't know how to modify my lexer so that the parser can take the tokens as input.

  • A linked list of tokens is extremely inefficient for large files (source files of around 80 MB take about 1.3 GB of RAM)
  • I could modify my lexer to give the next token every time it is called (idea taken from the Dragon Book), but I don't know what I will do if somewhere in the process I have to go back and read a previous token.

What is the right way to do these things?


Implementing a nextToken() method in the lexical analyser is the standard way. This method is called by the parser (or syntax analyser) until the entire input has been consumed.
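To make this concrete, here is a minimal sketch of such an on-demand lexer in Python. The token grammar (numbers, identifiers, a few operators), the Token type, and the name next_token (standing in for nextToken()) are assumptions made for illustration; your language will differ. The point is that only the current line is held in memory, so no token list is ever built.

```python
import io
import re
from typing import NamedTuple, TextIO


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token grammar, for illustration only.
TOKEN_RE = re.compile(r"(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[-+*/=()])")


class Lexer:
    """Produces one token per call; keeps only the current line in memory."""

    def __init__(self, source: TextIO):
        self._source = source
        self._line = ""
        self._pos = 0

    def next_token(self) -> Token:
        while True:
            # Skip whitespace on the current line.
            while self._pos < len(self._line) and self._line[self._pos].isspace():
                self._pos += 1
            if self._pos < len(self._line):
                break
            # Current line is exhausted: read the next one.
            self._line = self._source.readline()
            self._pos = 0
            if self._line == "":
                return Token("EOF", "")
        match = TOKEN_RE.match(self._line, self._pos)
        if match is None:
            raise SyntaxError(f"unexpected character {self._line[self._pos]!r}")
        self._pos = match.end()
        return Token(match.lastgroup, match.group())


# The parser drives the lexer, one token at a time.
lexer = Lexer(io.StringIO("x = 42 +\n  y1"))
tok = lexer.next_token()
while tok.kind != "EOF":
    print(tok.kind, tok.text)
    tok = lexer.next_token()
```

In a real compiler the loop at the bottom would be the parser itself, calling next_token() whenever it needs more input.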

but I don't know what I will do if somewhere in the process I have to go back and read a previous token

This is not usually necessary. What the parser may need to do, however, is 'push back' a token (or several tokens, depending on the parser's lookahead) that it has already seen. For this, the lexer provides a pushBack(Token) method, which ensures that the next call to nextToken() returns the supplied token rather than the next token in the input.
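A rough sketch of the push-back idea in Python follows. The class name PushbackLexer, the use of plain strings as tokens, and the "EOF" sentinel are all assumptions for illustration; push_back and next_token stand in for pushBack(Token) and nextToken(). Pushed-back tokens are kept on a small stack, so memory use depends on the lookahead, not on the file size.

```python
from typing import Iterator, List


class PushbackLexer:
    """Wraps a token source and lets the parser hand tokens back to it."""

    def __init__(self, tokens: Iterator[str]):
        self._tokens = tokens
        self._pushed: List[str] = []  # stack of tokens the parser returned

    def next_token(self) -> str:
        # Serve pushed-back tokens first, most recently pushed first.
        if self._pushed:
            return self._pushed.pop()
        # Otherwise pull a fresh token from the underlying source.
        return next(self._tokens, "EOF")

    def push_back(self, token: str) -> None:
        self._pushed.append(token)


lexer = PushbackLexer(iter(["a", "+", "b"]))
first = lexer.next_token()   # the parser looks at a token...
lexer.push_back(first)       # ...decides it is not ready for it...
again = lexer.next_token()   # ...and sees the same token on the next call
```

Note that to push back several tokens, the parser has to return them in reverse order of reading, so that they come out again in their original order.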


but I don't know what I will do if somewhere in the process I have to go back and read a previous token

It sounds like your matches are too greedy.

You might look into backtracking.
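One common way to support backtracking, sketched here in Python under some assumptions, is to buffer the tokens already read and let the parser save and restore its position. TokenBuffer, mark, reset, and match_sequence are hypothetical names, and tokens are plain strings for brevity. This sketch never discards buffered tokens; for large inputs you would want to drop everything before the oldest position the parser might still return to.

```python
from typing import Iterator, List


class TokenBuffer:
    """Remembers tokens already read so the parser can rewind to them."""

    def __init__(self, tokens: Iterator[str]):
        self._tokens = tokens
        self._buffer: List[str] = []
        self._pos = 0

    def next_token(self) -> str:
        # Pull from the underlying source only when the buffer runs out.
        if self._pos == len(self._buffer):
            self._buffer.append(next(self._tokens, "EOF"))
        token = self._buffer[self._pos]
        self._pos += 1
        return token

    def mark(self) -> int:
        return self._pos

    def reset(self, position: int) -> None:
        self._pos = position


def match_sequence(buf: TokenBuffer, expected: List[str]) -> bool:
    """Try to consume `expected`; on failure, rewind as if nothing was read."""
    start = buf.mark()
    for want in expected:
        if buf.next_token() != want:
            buf.reset(start)
            return False
    return True


buf = TokenBuffer(iter(["a", "=", "1"]))
is_call = match_sequence(buf, ["a", "(", ")"])    # fails and rewinds
is_assign = match_sequence(buf, ["a", "=", "1"])  # succeeds from the same spot
```

If your grammar can be parsed with a fixed, small lookahead, the simpler push-back approach is usually enough and backtracking is not needed at all.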
