tail call generated by clang 1.1 and 1.0 (llvm 2.7 and 2.6)

Developer https://www.devze.com 2023-01-01 00:29 Source: web

Related topics: clang llvm

After compiling the following code with clang -O2 (or with the online demo):

#include <stdio.h>
#include <stdlib.h>

int flop(int x);
int flip(int x) {
  if (x == 0) return 1;
  return (x+1)*flop(x-1);
}
int flop(int x) {
  if (x == 0) return 1;
  return (x+0)*flip(x-1);
}

int main(int argc, char **argv) {
  printf("%d\n", flip(atoi(argv[1])));
}

I get the following LLVM assembly in flip:

bb1.i:                                            ; preds = %bb1
  %4 = add nsw i32 %x, -2                         ; <i32> [#uses=1]
  %5 = tail call i32 @flip(i32 %4) nounwind       ; <i32> [#uses=1]
  %6 = mul nsw i32 %5, %2                         ; <i32> [#uses=1]
  br label %flop.exit

I thought that a tail call means dropping the current stack frame (i.e. the callee returns straight to the upper frame, so the next instruction should be ret %5), but according to this code the result is then fed into a mul. And the native assembly contains a plain call without tail-call optimisation (even with the appropriate flag for llc).

Can somebody explain why clang generates such code?

I also don't understand why LLVM has a "tail" marker on calls at all, if it could simply check that the next ret uses the result of the preceding call and then either perform the appropriate optimisation or generate the native equivalent of a tail call.

Take a look at the 'call' instruction in the LLVM Assembly Language Reference Manual. It says:

The optional "tail" marker indicates that the callee function does not access any allocas or varargs in the caller. Note that calls may be marked "tail" even if they do not occur before a ret instruction.

It's likely that one of the LLVM optimization passes run by Clang analyzes whether or not the callee accesses any allocas or varargs in the caller. If it doesn't, the pass marks the call with "tail" and lets another part of LLVM figure out what to do with the marker. Maybe the call can't be turned into a real tail call right now, but after further transformations it could be. I'm guessing it's done this way to make the ordering of the passes less important.