Pointer syntax in C: why does * only apply to the first variable?

Developer https://www.devze.com 2023-01-07 18:45 Source: Internet
The following declaration in C:

int* a, b;

will declare a as type int* and b as type int. I'm well aware of this trap, but what I want to know is why it works this way. Why doesn't it also declare b as int*, as most people would intuitively expect? In other words, why does * apply to the variable name, rather than the type?

Sure you could write it this way to be more consistent with how it actually works:

int *a, b;

However, I and everyone I've spoken to think of a as being of type "pointer to int", rather than as a pointer to some data whose type is "int".

Was this simply a bad decision by the designers of C, or is there some good reason why it's parsed this way? I'm sure the question has been answered before, but I can't seem to find it using the search.


C declarations were written this way so that "declaration mirrors use". This is why you declare arrays like this:

int a[10];

Were you to instead have the rule you propose, where it is always

type identifier, identifier, identifier, ... ;

...then arrays would logically have to be declared like this:

int[10] a;

which is fine, but doesn't mirror how you use a. Note that this holds for functions, too - we declare functions like this:

void foo(int a, char *b);

rather than

void(int a, char* b) foo;

In general, the "declaration mirrors use" rule means that you only have to remember one set of precedence and associativity rules, which apply both to operators like *, [] and () when you're using the value, and to the corresponding tokens *, [] and () in declarators.


After some further thought, I think it's also worth pointing out that spelling "pointer to int" as "int*" is only a consequence of "declaration mirrors use" anyway. If you were going to use another style of declaration, it would probably make more sense to spell "pointer to int" as "&int", or something completely different like "@int".


There's a web page on The Development of the C Language that says, "The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression." Search for that sentence on the page to find the relevant section that talks about this question.


There may be an additional historical reason, but I've always understood it this way:

One declaration, one type.

If a, b, c, and d must be the same type here:

int a, b, c, d;

Then everything on this line must be an integer as well.

int a, *b, **c, ***d;

The 4 integers:

  1. a
  2. *b
  3. **c
  4. ***d

It may be related to operator precedence, as well, or it may have been at some point in the past.


I assume it is related to the full declaration syntax for type modifiers:

int x[20], y;
int (*fp)(), z;

In these examples, it feels much more obvious that the modifiers are only affecting one of the declarations. One guess is that once K&R decided to design modifiers this way, it felt "correct" to have modifiers only affect one declaration.

On a side note, I would recommend just limiting yourself to one variable per declaration:

int *x;
int y;


The * modifies the variable name, not the type specifier. This is mostly because of the way the * is parsed. Take these statements:

char*  x;
char  *x;

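Those statements are equivalent. The * has to appear between the type specifier and the variable name, but the whitespace around it is not significant, so it can go on either side of the space. Grammatically, though, it belongs to the declarator (the part that names the variable), not to the type specifier. Given this, the declaration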

int*  a, b;

would not make b a pointer, because b's declarator has no * of its own. The * applies only to the declarator that immediately follows it.

Also, think about it this way: when you write the declaration int x;, you are indicating that x is an integer. If y is a pointer to an integer, then *y is an integer. When you write int *y;, you are indicating that *y is an integer (which is what you want). In the statement char a, *b, ***c;, you are indicating that the variable a, the dereferenced value of b, and the triply-dereferenced value of c are all of type char. Declaring variables in this way makes the usage of the star operator (nearly) consistent with dereferencing.

I agree that it would almost make more sense for it to be the other way around. To avoid this trap, I made myself a rule always to declare pointers on a line by themselves.


Consider the declarations:

int *a[10];
int (*b)[10];

The first is an array of ten pointers to integers, the second is a pointer to an array of ten integers.

Now, if the * were attached to the type specifier, it wouldn't be syntactically valid to put a parenthesis between them. So you'd have to find another way to differentiate between the two forms.


Because if the statement

int* a, b;

were to declare b as a pointer too, then you would have no way to declare

int* a;
int  b;

on a single line.

On the other hand, you can do

int*a, *b;

to get what you want.

Think about it like this: the way it is now is still the most concise and yet unambiguous way to do it. That's what C is mostly about :)
