What is the most portable way to read and write the highest bit of an integer in C?
This 开发者_C百科is a Bloomberg interview question. I didn’t give best answer at that time. Can anyone please answer it?
If the type is unsigned, it's easy:
(type)-1-(type)-1/2
For signed values, I know no way. If you find a way, it would answer several unanswered questions on SO:
C question: off_t (and other signed integer types) minimum and maximum values
Is there any way to compute the width of an integer type at compile-time?
Maybe others.
First, note that there's no portable way to access the top bit if we're talking about signed integers; there's simply no single portable representation defined in the standard, so the meaning of 'top bit' can in principle vary. Additionally, C does not allow direct access to the bitwise representation; you can access the int as a char
buffer, but you have no idea where the 'top bit' is located.
If we're only concerned with the non-negative range of a signed integer, and assuming said range has a size that is a power of two (if not, then we need to care about the signed representation again):
#define INT_MAX_BIT (INT_MAX - (INT_MAX >> 1))
#define SET_MAX_BIT(x) (x | INT_MAX_BIT)
#define CLEAR_MAX_BIT(x) (x & ~INT_MAX_BIT)
A similar approach can be used with unsigned ints, where it can be used to get the true top bit.
Here's a silly one, using:
Built-in Function: int __builtin_clz (unsigned int x)
Returns the number of leading 0-bits in x, starting at the most
significant bit position. If x is 0, the result is undefined.
First attempt:
int get_msb(int x) { return x ? __buildin_clz(x) == 0 : 0; }
Note: it's a quirk of C that functions specifying int
or unsigned int
parameters can be called with the other type without warning. But, this probably involves a conversion - the C++ Standard 4.7.2 says:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2n where n is the number of bits used to represent the unsigned type). [Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). ]
Which implies that the bit pattern may be changed if it's not a two's complement representation, which would stop this "solution" working reliably too. :-(
Chris's comment below provides a solution (incorporated here as a function rather than preprocessor macro):
int get_msb(int x) { return x ? __buildin_clz(*(unsigned*)&x) == 0 : 0; }
What's wrong with this one?
int get_msb(int n){
return ((unsigned)n) >> (sizeof(unsigned) * CHAR_BIT - 1);
// or, optionally
return n < 0;
};
int set_msb(int n, int msb){
if (msb)
return ((unsigned)n) | (1ULL << (sizeof(unsigned) * CHAR_BIT - 1));
else return ((unsigned)n) & ~(1ULL << (sizeof(unsigned) * CHAR_BIT - 1));
};
It takes care of endianness, number of bits in a byte, and works also on 1's complement.
#define HIGH_BIT(inttype) (((inttype)1) << (CHAR_BIT * sizeof(inttype) - 1))
example usage:
ptrdiff_t i = 4711;
i |= HIGH_BIT(ptrdiff_t); /* set high bit */
i &= ~HIGH_BIT(ptrdiff_t); /* clear high bit */
精彩评论