Why don't other languages support something similar to the preprocessor directives of C and its descendants?

Developer (https://www.devze.com), 2023-01-06 20:23. Source: web
I wonder why other languages do not support this feature. As I understand it, C and C++ code is platform dependent, so making it compile and run across various platforms is achieved with preprocessor directives. There are many other uses as well. For example, you can put all your debug printf calls inside #if DEBUG ... #endif, so that these lines are not compiled into the binary when you make a release build.

But in other languages, achieving this (the latter part) is difficult, or maybe impossible; I'm not sure. All the code gets compiled into the binary, increasing its size. So my question is: why do Java and other modern compiled languages not support this kind of feature, which lets you include or exclude a piece of code from the binary in a very handy way?


The major languages that don't have a preprocessor usually have a different, often cleaner, way to achieve the same effects.

Having a text-preprocessor like cpp is a mixed blessing. Since cpp doesn't actually know C, all it does is transform text into other text. This causes many maintenance problems. Take C++ for example, where many uses of the preprocessor have been explicitly deprecated in favor of better features like:

  • For constants, const instead of #define
  • For small functions, inline instead of #define macros

The C++ FAQ calls macros evil and gives multiple reasons to avoid using them.
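
To make the two bullets above concrete, here is a minimal sketch of a classic macro pitfall next to its language-level replacement. The names DOUBLE_MACRO, double_inline and kBufferSize are invented for illustration and are not taken from the C++ FAQ.

```cpp
// Hypothetical names throughout; a sketch of why textual macros surprise people.

// Pure text substitution: DOUBLE_MACRO(3) * 2 expands to 3 + 3 * 2.
#define DOUBLE_MACRO(x) x + x

// A real function: the argument is evaluated once and the call yields one value.
inline int double_inline(int x) { return x + x; }

// A typed, scoped constant instead of "#define BUFFER_SIZE 64".
const int kBufferSize = 64;

int macro_result()  { return DOUBLE_MACRO(3) * 2; }   // 3 + 3 * 2 == 9
int inline_result() { return double_inline(3) * 2; }  // (3 + 3) * 2 == 12
```

Extra parentheses in the macro would fix this particular case, but the compiler cannot enforce them, whereas the inline function is correct by construction.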


The portability benefits of the preprocessor are far outweighed by the possibilities for abuse. Here are some examples from real code I have seen in industry:

  • A function body becomes so tangled with #ifdef that it is very hard to read the function and figure out what is going on. Remember that the preprocessor works with text, not syntax, so you can do things that are wildly ungrammatical.

  • Code can become duplicated in different branches of an #ifdef, making it hard to maintain a single point of truth about what's going on.

  • When an application is intended for multiple platforms, it becomes very hard to compile all the code as opposed to whatever code happens to be selected for the developer's platform. You may need to have multiple machines set up. (It is expensive, say, on a BSD system to set up a cross-compilation environment that accurately simulates GNU headers.) In the days when most varieties of Unix were proprietary and vendors had to support them all, this problem was very serious. Today when so many versions of Unix are free, it's less of a problem, although it's still quite challenging to duplicate native Windows headers in a Unix environment.

  • Some code is protected by so many #ifdefs that you can't figure out what combination of -D options is needed to select the code. The problem is NP-hard, so the best known solutions require trying exponentially many different combinations of definitions. This is of course impractical, so the real consequence is that your system gradually fills with code that hasn't been compiled. This problem kills refactoring, and of course such code is completely immune to your unit tests and your regression tests, unless you set up a huge, multiplatform testing farm, and maybe not even then.

    In the field, I have seen this problem lead to situations where a refactored application is carefully tested and shipped, only to receive immediate bug reports that the application won't even compile on other platforms. If code is hidden by #ifdef and we can't select it, we have no guarantee that it typechecks—or even that it is syntactically correct.
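
A small sketch of that last point, that code hidden behind an unselected #ifdef is not checked at all. HYPOTHETICAL_PLATFORM_X and platform_id are invented names, and the sketch assumes nothing in the build defines the macro.

```cpp
// HYPOTHETICAL_PLATFORM_X is a made-up macro that nothing here defines.
#ifdef HYPOTHETICAL_PLATFORM_X
// The preprocessor skips this group, so the compiler proper never parses it
// and the syntax errors below go unnoticed.
int platform_id( { return "not an int" + ; }
#else
// Only this branch is ever compiled and tested on the developer's machine.
int platform_id() { return 1; }
#endif
```

Building the same file with -DHYPOTHETICAL_PLATFORM_X switches the first branch on, and only then does the compiler report the error.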

The flip side of the coin is that more advanced languages and programming techniques have reduced the need for conditional compilation in the preprocessor:

  • For some languages, like Java, all the platform-dependent code is in the implementation of the JVM and in the associated libraries. People have gone to huge lengths to make JVMs and libraries that are platform-independent.

  • In many languages, such as Haskell, Lua, Python, Ruby, and many more, the designers have gone to some trouble to reduce the amount of platform-dependent code compared to C.

  • In a modern language, you can put platform-dependent code in a separate compilation unit behind a compiled interface. Many modern compilers have good facilities for inlining functions across interface boundaries, so that you don't pay much (or any) penalty for this kind of abstraction. This wasn't the case for C because (a) there are no separately compiled interfaces; the separate-compilation model assumes #include and the preprocessor; and (b) C compilers came of age on machines with 64K of code space and 64K of data space; a compiler sophisticated enough to inline across module boundaries was almost unthinkable. Today such compilers are routine. Some advanced compilers inline and specialize methods dynamically.

Summary: by using linguistic mechanisms, rather than textual replacement, to isolate platform-dependent code, you expose all your code to the compiler, everything gets type-checked at least, and you have a chance of doing things like static analysis to ensure suitable test coverage. You also rule out a whole bunch of coding practices that lead to unreadable code.
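
As a rough illustration of the summary, the sketch below hides a platform difference behind an ordinary interface instead of an #ifdef. Every name in it (PathStyle, PosixPaths, WindowsPaths, kTargetIsWindows, native_paths) is hypothetical, and it only manipulates strings so that it builds anywhere; real platform code that calls OS-specific APIs would still live in separate compilation units, as described above.

```cpp
#include <string>

// All names are hypothetical; this only illustrates the idea.

// The interface the rest of the program codes against.
struct PathStyle {
    virtual ~PathStyle() = default;
    virtual std::string join(const std::string& dir,
                             const std::string& file) const = 0;
};

// Both implementations are ordinary code, so both are parsed and
// type-checked on every platform, whichever one ends up being used.
struct PosixPaths : PathStyle {
    std::string join(const std::string& dir,
                     const std::string& file) const override {
        return dir + "/" + file;
    }
};

struct WindowsPaths : PathStyle {
    std::string join(const std::string& dir,
                     const std::string& file) const override {
        return dir + "\\" + file;
    }
};

// An ordinary constant selects the implementation (assumed to be set per target).
const bool kTargetIsWindows = false;

const PathStyle& native_paths() {
    static const PosixPaths posix{};
    static const WindowsPaths windows{};
    if (kTargetIsWindows) return windows;
    return posix;
}
```

Because neither implementation is hidden from the compiler, both can also be unit-tested on a single development machine.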


Because modern compilers are smart enough to remove dead code in almost every case, hand-feeding the compiler this way is no longer necessary. That is, instead of:

#include <iostream>

#define DEBUG

int main()
{
#ifdef DEBUG
        std::cout << "Debugging...";
#else
        std::cout << "Not debugging.";
#endif
}

you can do:

#include <iostream>

const bool debugging = true;

int main()
{
    if (debugging)
    {
        std::cout << "Debugging...";
    }
    else
    {
        std::cout << "Not debugging.";
    }
}

and you'll probably get the same, or at least similar, code output.


Edit/Note: In C and C++, I'd absolutely never do this -- I'd use the preprocessor, if for no other reason than that it makes it instantly clear to the reader of my code that a chunk of it isn't supposed to be compiled under certain conditions. I am saying, however, that this is why many languages eschew the preprocessor.


A better question to ask is why C resorted to using a pre-processor to implement these sorts of meta-programming tasks. It isn't a feature so much as a compromise with the technology of the time.

The pre-processor directives in C were developed at a time when machine resources (CPU speed, RAM) were scarce and expensive. The pre-processor provided a way to implement these features on slow machines with limited memory. For example, the first machine I ever owned had 56 KB of RAM and a 2 MHz CPU. It still had a full K&R C compiler available, which pushed the system's resources to the limit, but was workable.

More modern languages take advantage of today's more powerful machines to provide better ways of handling the sorts of meta-programming tasks that the pre-processor used to deal with.


Other languages do support this feature, by using a generic preprocessor such as m4.

Do we really want every language to have its own text-substitution-before-execution implementation?


The C pre-processor can be run on any text file; the input need not be C.

Of course, if run on another language, it might tokenize in weird ways, but simple block structures like #ifdef DEBUG can be put in any language: run the C pre-processor on the file, then run your language-specific compiler on the result, and it will work.


Note that macros/preprocessing/conditionals/etc are usually considered a compiler/interpreter feature, as opposed to a language feature, because they are usually completely independent of the formal language definition, and might vary from compiler to compiler implementation for the same language.

A situation in many languages where conditional compilation directives can be better than if-then-else runtime code is when compile-time statements (such as variable declarations) need to be conditional. For example

$if debug
array x
$endif
...
$if debug
dump x
$endif

only declares/allocates/compiles x when x is actually needed, whereas

array x
boolean debug
...
if debug then dump x

probably has to declare x regardless of whether debug is true.
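
In C++ terms, the same contrast might look like the sketch below. DEBUG_TRACE, Buffer, access_count and read_at are hypothetical names; the sketch assumes DEBUG_TRACE is defined only for debug builds (for example with -DDEBUG_TRACE).

```cpp
#include <cstddef>

// DEBUG_TRACE is a hypothetical macro, assumed undefined in release builds.
struct Buffer {
    int data[4];
#ifdef DEBUG_TRACE
    int access_count;  // this member exists only in debug builds
#endif
};

int read_at(Buffer& b, std::size_t i) {
#ifdef DEBUG_TRACE
    ++b.access_count;  // bookkeeping compiled in only when tracing is on
#endif
    return b.data[i];
}
```

A runtime if (debug) test could skip the increment, but it could not remove the access_count member from the struct, so a release build would still pay for its storage.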


Many modern languages actually have syntactic metaprogramming capabilities that go way beyond CPP. Pretty much all modern Lisps (Arc, Clojure, Common Lisp, Scheme, newLISP, Qi, PLOT, MISC, ...) for example have extremely powerful (Turing-complete, actually) macro systems, so why should they limit themselves to the crappy CPP style macros which aren't even real macros, just text snippets?

Other languages with powerful syntactic metaprogramming include Io, Ioke, Perl 6, OMeta, Converge.


Because decreasing the size of the binary:

  1. Can be done in other ways (compare the average size of a C++ executable to a C# executable, for example).
  2. Is not that important when you weigh it against being able to write programs that actually work.


Other languages also have better dynamic binding. For example, we have some code that we cannot ship to some customers for export reasons. Our "C" libraries use #ifdef statements and elaborate Makefile tricks (which is pretty much the same).

The Java code uses plugins (à la Eclipse), so that we just don't ship that code.

You can do the same thing in C through the use of shared libraries... but the preprocessor is a lot simpler.


Another point nobody else has mentioned is platform support.

Most modern languages cannot run on the same platforms as C or C++ and are not intended to run on those platforms. For example, Java, Python and also natively compiled languages like C# need a heap; they are designed to run on an OS with memory management, libraries and a large amount of space, and they do not run in a freestanding environment. In such an environment you can use other ways to achieve the same thing. C can be used to program controllers with 2 KiB of ROM, and there you need a preprocessor for most applications.
