How does the compiler know which functionality is associated with which operator?
This is a bit of an open-ended question. Ask further and you are asking us to write a book. Multiple books, in fact:
- A book on modern algebra, which explains what numbers are and the workings of those mathematical operations that we simply take for granted;
- Multiple books from computer science such as structures and algorithms, assembly languages, and compiler theory, which describe how high-level languages are translated into instructions that a CPU can execute;
- Multiple books from computer engineering that describe CPUs, arithmetic logic units, floating point processors, gates, and other computer innards that computer scientists simply take for granted.
So, a short and perhaps a bit unsatisfactory answer:
Operators have a fixed meaning in C. That meaning is specified, to some extent, in the ISO/IEC C standard, which says, for example, that the result of the binary + operator is the sum of its operands. The standard does not go on to explain why 1+1=2. That is knowledge you learned in high school and before, in lower-level algebra classes. The theory behind the stuff we take for granted is something you learn in a modern algebra class (typically taken by math students after plowing through multiple calculus classes).
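As a small illustration of "fixed meaning", the sketch below shows the same operator token doing different things depending only on the operand types. It is written in C++ to match the other examples in this thread, but the same results hold in C. The constant names and the check_builtin_meanings helper are made up for this example.

```cpp
#include <cassert>

// Both operands are int, so '/' means integer division (truncating).
constexpr int int_quotient = 7 / 2;

// One operand is double, so the same '/' means floating-point division.
constexpr double real_quotient = 7.0 / 2;

// '%' is defined for integer operands only.
constexpr int remainder_value = 7 % 2;

// Hypothetical helper: checks the values above at run time.
inline void check_builtin_meanings() {
    assert(int_quotient == 3);
    assert(real_quotient == 3.5);
    assert(remainder_value == 1);
}
```

The programmer never tells the compiler which kind of division to use. It is chosen from the operand types, following the rules in the standard.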
Operators have a more flexible meaning in C++. Operators can be overloaded in C++. Sans this overloading, the meaning of operators in C++ is pretty much the same as in C. However, C++ gives the programmer the ability to define what foo+bar means in the case that foo and bar are instances of some user-defined type. The standard specifies the signatures of these overloaded operators and specifies how the implementation is to apply these overloads. How exactly foo and bar are added to one another: That is up to the implementor of the overloaded operator.
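A minimal sketch of what that looks like in practice. Vec2 is a hypothetical type invented for this example, and component-wise addition is just one choice the implementor could make. The language fixes the form of the function (its name and number of operands), not what it computes.

```cpp
#include <cassert>

// Hypothetical user-defined type: a 2-D vector with integer components.
struct Vec2 {
    int x;
    int y;
};

// The function must be named operator+ and take two operands.
// What "adding" means for Vec2 is entirely the author's decision.
constexpr Vec2 operator+(const Vec2& lhs, const Vec2& rhs) {
    return Vec2{lhs.x + rhs.x, lhs.y + rhs.y};
}

// Hypothetical helper showing the overload in use.
inline void demo_vec2_addition() {
    Vec2 foo{1, 2};
    Vec2 bar{3, 4};
    Vec2 sum = foo + bar;  // calls operator+(foo, bar)
    assert(sum.x == 4);
    assert(sum.y == 6);
}
```

Note that the body of operator+ itself uses the built-in + on int, so the user-defined operator bottoms out in operations the compiler already understands.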
The compiler knows the functionality for each built-in operator because the programmers who wrote the compiler made it know the functionality. A compiler is a program just like any other, and it can be made to do whatever its designers want.
The compiler knows the behavior of user-defined (overloaded) operators because it has already seen declarations for those operators, so when it sees them used elsewhere in the program, it checks its list of available overloads and chooses the one that best fits the combination of operands being used. The standard defines how to decide what the "best" combination is based on the types, the current scope, and whatever type conversions are defined. If there is no best fit, then you get a compilation error.
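Here is a contrived sketch of that selection process. Probe is a hypothetical type, and its two operator+ overloads return a string naming themselves instead of doing arithmetic, purely so that the chosen overload can be observed.

```cpp
#include <cassert>
#include <string>

// Hypothetical type used only to observe which overload is selected.
struct Probe {};

// Each overload reports its own parameter type.
inline std::string operator+(Probe, int)    { return "Probe + int"; }
inline std::string operator+(Probe, double) { return "Probe + double"; }

// Hypothetical helper exercising the overloads.
inline void demo_overload_resolution() {
    Probe p;
    assert((p + 1) == "Probe + int");        // exact match on int
    assert((p + 1.5) == "Probe + double");   // exact match on double
    assert((p + 'a') == "Probe + int");      // char -> int is a promotion
    assert((p + 2.5f) == "Probe + double");  // float -> double is a promotion
    // p + 1L;  // would not compile: long -> int and long -> double are
    //          // conversions of equal rank, so there is no best fit
}
```

The commented-out line is the "no best fit" case: both overloads are viable, neither is better than the other, and the compiler reports an ambiguity error.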
To answer the question in the title, the operators are "defined" in the source code which makes up the compiler. The compiler is just a program, like any other. It reads in C or C++ code, and parses it according to a grammar. The input is then converted to a machine language. The CPU ultimately "defines" the operations of a machine language in its hardware. For example, "+" might trigger the contents of various registers to be fed through a ripple-carry adder.
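To make the ripple-carry adder less abstract, here is a software sketch of one, using only bitwise AND, OR, and XOR. It assumes an 8-bit width for brevity. The function name ripple_carry_add is made up for this example, and real hardware does this with gates rather than a loop (and often with faster adder designs).

```cpp
#include <cassert>
#include <cstdint>

// Adds two 8-bit values one bit at a time, the way a chain of full adders
// would: each stage combines two input bits with the carry from the stage
// before it.
inline std::uint8_t ripple_carry_add(std::uint8_t a, std::uint8_t b) {
    unsigned result = 0;
    unsigned carry = 0;
    for (int bit = 0; bit < 8; ++bit) {
        unsigned x = (a >> bit) & 1u;
        unsigned y = (b >> bit) & 1u;
        unsigned sum = x ^ y ^ carry;         // full-adder sum output
        carry = (x & y) | (carry & (x ^ y));  // full-adder carry output
        result |= sum << bit;
    }
    // The carry out of the top bit is discarded, so the result wraps.
    return static_cast<std::uint8_t>(result);
}

// Hypothetical helper with a few spot checks.
inline void demo_ripple_carry_add() {
    assert(ripple_carry_add(1, 1) == 2);
    assert(ripple_carry_add(100, 27) == 127);
    assert(ripple_carry_add(255, 1) == 0);  // wraps around at 8 bits
}
```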
Other languages implement the operators in more or less the same way.
Bjarne Stroustrup's book (The C++ Programming Language) shows how to implement a simple calculator language. That calculator can calculate simple sums like 5 + 3 * 16. Now, a compiler has to deal with more complicated expressions, with variables and function calls, e.g. a + 5 * foo(), but the idea remains the same. When you see +, you first look at the number on the left side, then at the right side, and then you add these numbers, because that's what + means.
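The following is not the code from Stroustrup's book, just a much smaller sketch in the same spirit. It assumes non-negative integers, only + and *, and well-formed input with no error handling. The TinyCalc class and its grammar are invented for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical minimal evaluator. Grammar assumed for this sketch:
//   expr := term { '+' term }
//   term := number { '*' number }
class TinyCalc {
public:
    explicit TinyCalc(std::string text) : text_(std::move(text)) {}

    long evaluate() {
        pos_ = 0;
        return expr();
    }

private:
    void skip_spaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
        }
    }

    // Consumes c and returns true if it is the next non-space character.
    bool accept(char c) {
        skip_spaces();
        if (pos_ < text_.size() && text_[pos_] == c) {
            ++pos_;
            return true;
        }
        return false;
    }

    long number() {
        skip_spaces();
        long value = 0;
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
            value = value * 10 + (text_[pos_] - '0');
            ++pos_;
        }
        return value;
    }

    // '*' binds tighter than '+', so it is handled one level down.
    long term() {
        long value = number();
        while (accept('*')) {
            value *= number();
        }
        return value;
    }

    // See '+': take the left side, then the right side, then add them.
    long expr() {
        long value = term();
        while (accept('+')) {
            value += term();
        }
        return value;
    }

    std::string text_;
    std::size_t pos_ = 0;
};

// Hypothetical helper evaluating the expression from the text above.
inline void demo_tiny_calc() {
    assert(TinyCalc("5 + 3 *16").evaluate() == 53);
}
```

This sketch computes the result on the spot. A compiler follows the same structure but, instead of adding the two numbers itself, emits the machine instructions that will add them when the program runs.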
For C: Dennis Ritchie decided what the operators would do back in the early 1970s, and it is the job of each compiler's authors to implement those decisions by generating conforming code based on the operand types.
For C++: The same as for C, except that the existing operators can also be extended to new types by library and application code. Needless to say, at some point the operation must be implemented with core functionality or in assembly language.