开发者

Generating intermediate code in a compiler. Is an AST or parse tree always necessary when dealing with conditionals?

开发者 https://www.devze.com 2023-02-17 23:41 出处:网络
I\'m taking a compiler-design class where we have to implement our own compiler (using flex and bison). I have had experience in parsing (writing EBNF\'s and recursive-descent parsers), but this is my

I'm taking a compiler-design class where we have to implement our own compiler (using flex and bison). I have had experience in parsing (writing EBNF's and recursive-descent parsers), but this is my first time writing a compiler.

The language design is pretty open-ended (the professor has left it up to us). In class, the professor went over generating intermediate code. He said that it is not necessary for 开发者_运维问答us to construct an Abstract Syntax Tree or a parse tree while parsing, and that we can generate the intermediate code as we go.

I found this confusing for two reasons:

  • What if you are calling a function before it is defined? How can you resolve the branch target? I guess you would have to make it a rule that you have to define functions before you use them, or maybe pre-define them (like C does?)

  • How would you deal with conditionals? If you have an if-else or even just an if, how can you resolve the branch target for the if when the condition is false (if you're generating code as you go)?

I planned on generating an AST and then walking the tree after I create it, to resolve the addresses of functions and branch targets. Is this correct or am I missing something?


The general solution to both of your issues is to keep a list of addresses that need to be "patched." You generate the code and leave holes for the missing addresses or offsets. At the end of the compilation unit, you go through the list of holes and fill them in.

In FORTH the "list" of patches is kept on the control stack and is unwound as each control structure terminates. See FORTH Dimensions

Anecdote: an early Lisp compiler (I believe it was Lisp) generated a list of machine code instructions in symbolic format with forward references to the list of machine code for each branch of a conditional. Then it generated the binary code walking the list backwards. This way the code location for all forward branches was known when the branch instruction needed to be emitted.


The Crenshaw tutorial is a concrete example of not using an AST of any kind. It builds a working compiler (including conditionals, obviously) with immediate code generation targeting m68k assembly.

You can read through the document in an afternoon, and it is worth it.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号