开发者

IR tree representation of binary operation

开发者 https://www.devze.com 2022-12-24 23:40 出处:网络
I have a simple operation like so: k + 1 What would be a IR tree representation of it? So far I came up with:

I have a simple operation like so:

k + 1

What would be a IR tree representation of it?

So far I came up with:

BINOP(+, MEM(NAME(k)), CONST(1))

but I am not sure if I need the MEM for NAME or not. Any help welcome.

The notation we use is exactly the same as in this pdf: http://www.computing.dcu.ie/~hamilton/teaching/CA449/notes/translate.pdf

It is from modern compiler implementation in java book.开发者_高级运维


Well, it really depends on how your IR is represented. Also, as pointed out by mrjoltcola, what you have suggested looks more like an AST (abstract syntax tree). An AST is more high level, and is generally a tree structure generated directly from the source. An IR is generally much simpler and more low-level (and code is usually represented as a list of simple instructions, not a tree of expressions). ASTs are generally highly specific to the compiler. An IR is often compiler and/or language independent (see for example LLVM).

If you're are going to represent it in text, I think that

+(k,1)

would be simpler, because you're going to need a fairly complex lexer/parser anyways, and this would be a bit more readable for humans (although a bit less readable for computers). There is no need to have things like MEM or NAME, because an identifier in that case would obviously be a variable access (it does depend on the language and the structure of your compiler though). I definitely wouldn't use something like BINOP though, because it really only complicates the code (you still have to handle addition separately from the other binary operations).

If you are just keeping it in memory, it depends on the language. In Haskell, I would do something like this:

data Expr = Const Int | Variable String | Add Expr Expr | ...

and then your example would be:

Add (Variable "k") (Const 1)

In C++/C#/Java, you would probably use classes to emulate the algebraic data types.

For example, in rough C++-ish pseudocode:

class Expr {};
class Const : Expr {int v; Const(int v) : v(v) {}};
class Variable : Expr {string n; Variable(string n) : n(n) {}};
class Add : Expr {Expr a, b; Add(Expr a, Expr b) : a(a), b(b) {}};

...

Add(Variable("k"), Const(1))


The only thing I can do is read the author's (Appel) rules of his IR tree language.

For a grammar:

  <ADDEXPR> ::= <EXPR> + <EXPR>

  <EXPR> ::= IDENTIFIER
           | LITERAL

The AST tree might be:

  BINOPEXPR(+, EXPR(IDENTIFIER(k)), EXPR(LITERAL(1)))

Or might include IdentifierExpression and LiteralExpression, so it might be:

  BINOPEXPR(+, IDENT_EXPR(k), LITERAL_EXPR(1))

But AST and IR tree are different things. So according to Appel's book, in section 7.1 of the C version:

NAME(n) = Sumbolic constant n corresponding to an assembly language label.

MEM(e) = Contents of wordSize bytes of memory starting at address e. When MEM() isused as the left child of a MOVE(), it means "store", but anywhere else it means "fetch".

LABEL(n) = Define constant value of name n to be the current machine code address. This is like a label definition in assembly language. The value NAME(n) may be the target of jumps, calls, etc.

Given his rules, NAME() is not what you want there for a variable k, name is used for code labels.

My guess based on reading him is if k is a variable, and in this case it is an r-value, then you use simply MEM(e) but e could be either a local variable (in the local stack frame), or a global variable, or even a temporary. So the translation for "k + 1" will depend on where "k" is allocated. If its in the local stack frame, then k is:

MEM(BINOP(+, TEMP fp, CONST (address of k)))

So k + 1 would be:

BINOP(+, MEM(BINOP(+, TEMP fp, CONST (address of k))), 1)

So you need to be clear on how you are allocating storage for your variables. That will be in the symbol table for the identifier.

I've got at least 20 compiler books and honestly, I find Appel's book confusing and too brief in areas. He makes conceptual leaps that a student does not follow, and could use a bit more elaboration in places. Since I've never actually implemented IR trees, (I always write to a textual Intermediate Language with assembler support for both declaring variables of different scopes, and symbolic temporaries) I cannot say for sure what he means, so I recommend confirming with your professor as he has probably taught with the book before and may know it better.

Hope that helps.

0

精彩评论

暂无评论...
验证码 换一张
取 消