开发者

Need some clarification regarding casting in C

开发者 https://www.devze.com 2022-12-14 09:23 出处:网络
I was just reading about the bad practice of casting the return value of malloc. If I understood correctly, it is absolutely legal to leave the cast as it is done implicitly (and should be left, becau

I was just reading about the bad practice of casting the return value of malloc. If I understood correctly, it is absolutely legal to leave the cast as it is done implicitly (and should be left, because of other problems it could generate). Well my question is, when should I then cast my values ? Is there some general rule or something ? For example, this code compiles without any errors with gcc -W -Wall (except unused bar, but that's not the point):

float foo(void) {
    double bar = 4.2;
    return 开发者_开发百科bar;
}

int main(void) {
    double bar = foo();
    return 0;
}

I'm confused now. What are the good practices and the rules about casting ?

Thanks.


There are several situations that require perfectly valid casting in C. Beware of sweeping assertions like "casting is always bad design", since they are obviously and patently bogus.

One huge group of situations that critically relies on casts is arithmetic operations. The casting is required in situations when you need to force the compiler to interpret arithmetic expression within a type different from the "default" one. As in

unsigned i = ...;
unsigned long s = (unsigned long) i * i;

to avoid overflow. Or in

double d = (double) i / 5;

in order to make the compiler to switch to floating-point division. Or in

s = (unsigned) d * 3 + i;

in order to take the whole part of the floating point value. And so on (the examples are endless).

Another group of valid uses is idioms, i.e. well-established coding practices. For example, the classic C idiom when a function takes a const pointer as an input and returns a non-const pointer to the same (potentially constant) data, like the standard strstr for example. Implementing this idiom usually requires a use of a cast in order to cast away the constness of the input. Someone might call it bad design, but in reality there's no better design alternative in C. Otherwise, it wouldn't be a well-established idiom :)

Also it is worth mentioning, as an example, that a pedantically correct use of standard printf function might require casts on the arguments in general case. (Like %p format specifier expecting a void * pointer as an argument, which means that an int * argument has to be transformed into a void * in one way or another. An explicit cast is the most logical way to perform the transformation.).

Of course, there are other numerous examples of perfectly valid situations when casts are required.

The problems with casts usually arise when people use them thoughtlessly, even where they are not required (like casting the return of malloc, which is bad for more reasons than one). Or when people use casts to force the compiler to accept their bad code. Needless to say, it takes certain level of expertise to tell a valid cast situation from a bad cast one.

In some cases casts are used to make the compiler to stop issuing some annoying and unnecessary warning messages. These casts belong to the gray area between the good and the bad casts. On the one hand, unnecessary casts are bad. On the other hand, the user might not have control over the compilation settings, thus making the casts the only way to deal with the warnings.


If and when you need to cast I always suggest you do this explicitly to show others, or perhaps yourself in the future, that you intended for this behavior.

By the way, the gcc warning for this is -Wconversion. Unfortunately -Wall and -Wextra still leave alot of good warnings off.

Here are the flags I use when I want gcc to be very lint-like

-pedantic -std=c99 -ggdb3 -O0 -Wall -Wextra -Wformat=2 -Wmissing-include-dirs -Winit-self -Wswitch-default -Wswitch-enum -Wunused-parameter -Wfloat-equal -Wundef -Wshadow -Wlarger-than-1000 -Wunsafe-loop-optimizations -Wbad-function-cast -Wcast-qual -Wcast-align -Wconversion -Wlogical-op -Waggregate-return -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-declarations -Wpacked -Wpadded -Wredundant-decls -Wnested-externs -Wunreachable-code -Winline -Winvalid-pch -Wvolatile-register-var -Wstrict-aliasing=2 -Wstrict-overflow=2 -Wtraditional-conversion -Wwrite-strings

I also check my code first with cppcheck which is a free static code analyzer for C and C++. Highly recommended.


My simple guideline - if it needs a cast, it's probably wrong. If you don't need a cast, don't use them.


The reason you don't cast the return value of malloc is because you are always assigning the return value to a pointer type, and the C standard allows a void * to be implicitly cast to any other pointer type. Explicitly casting it is redundant, and therefore unnecessary.


You can't ask about casting in 'C' w/o understanding that casting covers more than one type of operation. There are essentially two, type conversion and type coercion. In C++, because it has more type info, it's creating 4 types of casts and codified this with an exclusive notation. reinterpret_cast<>, const_cast<>, dynamic_cast<> and static_cast<>. You don't have these in C since all casts have the syntax (ctype) but the reasons for them remain and it helps to understand why casting is required even though your question was about 'C' specifically.

The "need" for static cast is what you show in your example. The compiler will do them for you, even if you don't specify it - however, crank the warning level up high enough, and the compiler will warn you if there is a loss of precision as there is going from double to float (your return bar; statement). Adding a cast tells the compiler the loss of precision was intended.

The second least dangerous cast is a const cast<>. It's used to removed const or volatile from a type. This commonly occurs where structures have internal "caches". So a caller may have a const version of your structure, but an "internal function" needs to update the cache so will have to cast from a pointer to a const struct to a regular struct to update an internal field.

The most dangerous type is a reinterpret cast and why people will go on and on about how bad it is to cast. That's where you're not converting anything, but telling the compiler to reinterpret a value as a totally different type. What is below might have been added by a naive programmer trying to get rid of a compiler error.

    char **ptostr = (char **p) "this is not a good idea"; 

Likely the correct fix was to use an '&' and this is how casting gets a bad reputation. Casts like this can be used for good or evil. I used it in the answer to another question about how to find the smallest power of 2 in order to leverage the power of the FPU in a CPU. A better example of being used for good, is when implementing linked lists. If the links live in the objects themselves, you have to cast from the link pointer back out to the enclosed object (a good use for the offsetof macro if the links can't be at the top of the structure).

A dynamic cast has no language support in C but the circumstance still occurs. If you have a heterogeneous list, then you might verify an object was of a given type using a field in the list's link header. Implemented manually, you would verify the type as being compatible and return NULL if it wasn't. This is special version of the reinterpret cast.

There are many sophisticated programing patterns that require casting so I wouldn't say casting needs to be avoided or indicates something is wrong. The problem with 'C' is how you write the unsafe ones in the same exact way as the safe ones. Keeping it contained and limited is a good practice so you can make sure you have it right (e.g., use library routines, strong typing and asserts if you can).


Casting a result should only done when strictly necessary; if you are using code developed from two different people (such as static libraries, or dynamic libraries), and two functions don't use compatible values, then casting is the only solution (as long as you don't try to cast a string to an integer).

Before to use the casting, it would be better to verify if the datatypes used are correct. In the example code (which has the purpose of providing an example), it doesn't make sense to declare the returned value to be a float value when the function returns a double.


double bar = foo();

What happens here is called promotional conversion, where the value of the casted variable is reserved after the conversion. The reverse is not true, i.e. float -> double. The only answer is to cast only when you really need to. Casting a lot is a sign of bad design.


In your example there's a loss of precision, but the cast is implicit. There are times when casting is absolutely necessary, such as when you're reading data from a byte stream or when all you have is data coming in through a void* pointer, but you know what data it represents. But for most part, casting should be avoided and reserved for these extreme cases.


What you're looking at is implicit type conversion. This is considered safe if you're starting with a type having a more restricted range than the one you're ending up with, i.e. short to int is OK, as is float to double.

I'm quite surprised that gcc isn't generating a warning when converting a double to a float; I believe Microsoft's compiler does.


You might find these two SO posts informative:

Specifically, what’s dangerous about casting the result of malloc?

Do Implict Function Declarations in C Actually Generate Object Code?


The admonition against casting the result of malloc() is a special case, due to C implicitly typing the result of previously undeclared functions to int (which IINM is disallowed as of C99).

Generally, you want to limit the use of explicit casts as much as possible; the only time you need to use one is if you're trying to assign a value of one type to a variable of an incompatible type (e.g., assign a pointer value to an int variable or vice versa). Since void * is compatible with every other pointer type, no explicit cast is needed. However, if you're trying to assign a value of type int * to a variable of type struct foo *, an explicit cast is required. If you find yourself assigning values of incompatible types a lot, then you may want to revisit your design.


Basically you need to cast arguments to functions that expect a different parameter than their prototype claims.

For example, isalpha() has a prototype with an int argument, but really expects an unsigned char.

char *p;
if ((*p != EOF) && isalpha((unsigned char)*p) /* cast needed */
{
    /* ... */
}

And, you need to be extra careful with functions that accept a variable number of arguments, eg:

long big;
printf("%d\n", (int)big);

Edit

The compiler cannot convert the arguments of variadic functions to the proper type, because there is no type information available in the prototype itself. Consider a printf()-like function

int my_printf(const char *fmt, ...);

As far as the compiler is aware, you can pass values of all kinds of types in the "..." argument, and you must make sure the arguments match what the function expects. For example, let's say the my_printf() function accepts a value of type time_t with a corresponding "%t" specifier in the format string.

my_printf("UNIX Epoch: %t.\n", 0);         /* here, 0 is an int value */
my_printf("UNIX Epoch: %t.\n", (time_t)0); /* and here it is a time_t */

My compiler does not want to make this fail! Apparently it (and the one at codepad too) passes 8 bytes for each argument in "..."

Using a prototype without "..." ( int my_printf(const char *fmt, time_t data); ), the compiler would automagically convert the first 0 to the right type.

Note: some compilers (gcc included) will validate the arguments against the format string for printf() if the format string is a literal string


There are at least two kinds of casts:

  • Casts of numeric values that cause a change of representation. (This is a term of art you will find in Harbison and Steele's very helpful C Reference Manual.) These casts are mostly innocuous; the only way you can do harm is by, e.g., casting a wider type to a narrower type, in which case you are deliberately throwing away bits. Only you, the programmer, know whether it's safe to throw away those bits.

    C also has an insidious feature that it may perform a change of representation on your behalf when you assign, return, or pass a numeric expression whose type doesn't exactly match the type of the associate lvalue, result, or argument. I think this feature is pernicious, but it is firmly embedded in the C way of doing things. The gcc option -Wconversion will let you know where the compiler is doing a change of representation on your behalf, without being asked.

  • Casts that don't involve a change of representation but simply ask the compiler to view bits a certain way. These include casts between signed and unsigned types of the same size, as well as casts between pointer types that point to different sorts of data. The type void * enjoys a special status in C as the compiler will convert freely between void * and other data pointer types without a cast. Note that a cast between pointers of two different types of data never involves a change of representation. This is one reason the void * convention can work.

One reason C programmers try to avoid gratuitous casts is that when you write a cast, the compiler trusts you completely. If you make a mistake, the compiler is not going to catch it for you. For this reason I advise my student never to cast pointer types unless they know exactly what they are doing.


P.S. I thought it would be possible to write a short, helpful answer to this question. Wrong!

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号