Say I have a poor language that produces a lot of redundant assembly code (say, for something as simple as a+b it emits ten lines of assembly, but it does the job) and another, C-like language that produces nicely optimized assembly (two lines for something as simple as a + b). I write a compiler for the same subset in each of these languages. Now I have a compiler for my language and am ready for bootstrapping. Don't you think the compiler from the first case would be a bad choice, since it can bootstrap but only with poor code generation? Put differently, isn't the first language used to define the subset going to influence all the layers above, i.e. every compiler generated from this compiler?
A freshly bootstrapped compiler won't beat a C compiler, that's for sure. But nobody says it has to stay that way (well, you'll have a very hard time beating even modern C compilers, but let's assume we're competing with something other than compilers for portable assembly language refined over thirty years). Depending on your language, it's very possible that extending the bootstrapped compiler is much easier than extending the one written in C. This can allow many optimization passes that would have been harder to implement in C, gradually increasing the performance of your compiler (as it self-hosts, i.e. compiles itself) and the performance of all other programs you compile.
That distinction brings us to another important point: The compiler's performance is rarely relevant, as long as it's not completely unreasonable. The performance of the generated code is usually much more important, and that depends on your compiler's code generator, not on the code generator of the compiler used to compile your generator.
Third, with projects such as LLVM, generating decent assembly code is no longer as hard as it used to be. If you generate okay LLVM code, even if it contains redundancy, LLVM has many optimization passes that can take care of that and will produce better actual assembly code and allocate registers better than you could do by yourself within a reasonable timespan.
How does the output of your compiler, once bootstrapped, depend on the bootstrapping language? If you're doing your own code generation, the answer should be not at all. Sure, the compiler might not have optimal code, but you can solve that by compiling it again with itself.
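A toy sketch of that point, not a real compiler: the two functions below stand in for the same code generator as built by a good host compiler and by a poor one. The pseudo-assembly mnemonics and all function names are made up for illustration. The wasteful version does redundant work, yet what it emits is byte-for-byte the same, because the output is determined by the generator's logic and not by how efficiently that logic runs.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical toy generator for "a + b" in made-up pseudo-assembly.
   This stands in for the generator as built by a good host compiler:
   one direct formatting call. */
int gen_add_direct(char *out, size_t cap, const char *a, const char *b)
{
    return snprintf(out, cap, "load r0, %s\nadd r0, %s\n", a, b);
}

/* The same generator logic executed wastefully, standing in for a build
   produced by a poor bootstrap compiler: extra buffers and extra copies. */
int gen_add_wasteful(char *out, size_t cap, const char *a, const char *b)
{
    char line1[64], line2[64], tmp[128];

    snprintf(line1, sizeof line1, "load r0, %s\n", a);
    snprintf(line2, sizeof line2, "add r0, %s\n", b);

    tmp[0] = '\0';
    strncat(tmp, line1, sizeof tmp - strlen(tmp) - 1);
    strncat(tmp, line2, sizeof tmp - strlen(tmp) - 1);

    return snprintf(out, cap, "%s", tmp);
}

/* Returns 1 when both builds emit byte-identical output, 0 otherwise. */
int same_output(const char *a, const char *b)
{
    char x[128], y[128];

    gen_add_direct(x, sizeof x, a, b);
    gen_add_wasteful(y, sizeof y, a, b);
    return strcmp(x, y) == 0;
}
```

Calling `same_output("a", "b")` should report a match: the slow build is slower, not different.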
About the only thing I can think of that's passed from one generation of a bootstrapped compiler to the next is the value of constants, e.g. "1" will be equal to "1" no matter what kind of code the compiler generates.
An interesting side note about constants: C's escape characters. I've written C compilers using bootstrapping and have been amused (I'm easily amused) when I see code like:
    // Decode escape characters.
    if (ch == '\\') {
        ch = nextchar();
        switch (ch) {
        case 'n':
            ch = '\n';
            break;
        ...
Where did the initial value of '\n' come from? Somebody, somewhere, had to tell some compiler that '\n' has the value 10. ;-)
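One way to picture the answer, as a sketch only: the very first compiler in the chain cannot lean on escape sequences it does not understand yet, so it has to spell the values out as numbers. Every later generation can be written with the escapes themselves and simply inherits the numbers. The function names below are hypothetical, the values assume ASCII, and returning unknown escapes unchanged is an assumption made for the example.

```c
/* Seed version: written for a compiler that knows nothing about escape
   sequences, so every value is given numerically (ASCII assumed). */
int decode_escape_seed(int ch)
{
    switch (ch) {
    case 'n': return 10;  /* line feed */
    case 't': return 9;   /* horizontal tab */
    case '0': return 0;   /* NUL */
    case 92:  return 92;  /* backslash, itself written as a number */
    default:  return ch;  /* assumption: unknown escapes pass through */
    }
}

/* Self-hosted version: compiled by a compiler that already decodes
   escapes, so the source can use them and inherit the seed's values. */
int decode_escape_hosted(int ch)
{
    switch (ch) {
    case 'n':  return '\n';
    case 't':  return '\t';
    case '0':  return '\0';
    case '\\': return '\\';
    default:   return ch;
    }
}
```

On an ASCII system the two functions agree for every input, and only the seed version actually states that newline is 10.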