In learning how floating point numbers are represented in computers I have come across the term "bias value" that I do not quite understand.
The bias value in floating point numbers has to do with whether the exponent part of a floating point number is negative or positive.
The bias value of a single-precision floating point number is 127, which means that 127 is always added to the exponent part of the number. How does doing this help determine whether the exponent is negative or positive?
b0lt has already explained how bias works. At a guess, perhaps you'd like to know why they use a bias representation here, even though virtually all modern computers use two's complement essentially everywhere else (and even machines that don't use two's complement use one's complement or sign-magnitude, not bias).
One of the goals of the IEEE floating point standards was that you could treat the bits of a floating point number as a (signed) integer of the same size, and if you compared them that way, the values would sort into the same order as the floating point numbers they represent.
If you used a two's complement representation for the exponent, a small positive number (i.e., one with a negative exponent) would look like a very large integer, because the second most significant bit (the top bit of the exponent field) would be set. By using a bias representation instead, you don't run into that -- a smaller exponent in the floating point number always looks like a smaller integer.
FWIW, this is also why floating point numbers are typically arranged with the sign first, then the exponent, and finally the significand in the least significant bits--this way, you can take positive floating point numbers, treat those bits as integers, and sort them. When you do so, the result will have the floating point numbers in the correct order. For example:
#include <algorithm>
#include <cstring>
#include <iostream>
#include <vector>

int main() {
    static_assert(sizeof(long long) == sizeof(double), "expects 64-bit double and long long");

    // some arbitrary floating point values
    std::vector<double> vals = { 1e21, 1, 2.2, 2, 123, 1.1, 0.0001, 3, 17 };
    std::vector<long long> ivals;

    // Take those floating point values, and treat the bits as integers.
    // std::memcpy copies the bits without the aliasing problems of reinterpret_cast.
    for (double v : vals) {
        long long i;
        std::memcpy(&i, &v, sizeof i);
        ivals.push_back(i);
    }

    // Sort them as integers:
    std::sort(ivals.begin(), ivals.end());

    // Print out both the integers and the floating point value those bits represent:
    for (long long i : ivals) {
        double d;
        std::memcpy(&d, &i, sizeof d);
        std::cout << i << "\t(" << d << ")\n";
    }
}
When we run this, the result looks like this:
4547007122018943789 (0.0001)
4607182418800017408 (1)
4607632778762754458 (1.1)
4611686018427387904 (2)
4612136378390124954 (2.2)
4613937818241073152 (3)
4625478292286210048 (17)
4638355772470722560 (123)
4921056587992461136 (1e+21)
As you can see, even though we sorted them as integers, the floating point numbers that those bits represent also come out in the correct order.
This does have a limitation with respect to negative numbers. While all (non-ancient) computers agree on the representation of positive integers, three representations have (fairly recently) been used for signed integers: signed magnitude, one's complement, and two's complement.
Just treating the bits as an integer and comparing will work fine on a computer that uses signed magnitude representation for integers, because that is also how the floating point format stores its sign. On computers that use one's complement or two's complement, negative numbers will sort in inverted order. Since this is still a simple rule, it's pretty easy to write code that works with it. If we change the sort call above to something like this:
std::sort(ivals.begin(), ivals.end(),
    // two negatives compare in reverse; every other pair compares normally
    [](auto a, auto b) { if (a < 0 && b < 0) return b < a; return a < b; }
);
...it will then correctly sort both positive and negative numbers. E.g., input of:
std::vector<double> vals = { 1e21, 1, 2.2, 2, 123, 1.1, 0.0001, 3, 17, -0.001, -0.00101, -1e22 };
Will produce a result of:
-4287162073302051438 (-1e+22)
-4661071411077222194 (-0.00101)
-4661117527937406468 (-0.001)
4547007122018943789 (0.0001)
4607182418800017408 (1)
4607632778762754458 (1.1)
4611686018427387904 (2)
4612136378390124954 (2.2)
4613937818241073152 (3)
4625478292286210048 (17)
4638355772470722560 (123)
4921056587992461136 (1e+21)
In single precision floating point, you get 8 bits in which to store the exponent. Instead of storing it as a signed two's complement number, it was decided that it'd be easier to just add 127 to the exponent and store the result as an unsigned number. The exponents of normal numbers run from -126 to +127, so the stored value is always between 1 and 254. If the stored value is greater than the bias, the exponent is positive; if it's lower than the bias, the exponent is negative; if it's equal, the exponent is zero.
Adding more detail to the above answers.
To represent 0, infinity and NaN (Not-a-Number) in floating point, IEEE decided to use special encoding values.
If all bits of the exponent field are set to 0 and all bits of the fraction part are 0, then the floating-point number is 0.0. (A zero exponent field with a nonzero fraction encodes a subnormal number instead.)
If all bits of the exponent field are set to 1 and all bits of the fraction part are 0, then the floating-point number is infinity.
If all bits of the exponent field are set to 1 and the fraction part is not 0, then the floating-point number is NaN.
So, in single-precision we have 8 bits to represent the exponent field, and two of its values (all 0s and all 1s) are reserved for the special cases, so we basically have 256 - 2 = 254 values that can be represented in the exponent. So, we can effectively represent -126 to 127 in the exponent, i.e., 254 values (126 + 127 + 1, where the 1 is added for 0).
To specifically address your confusion: an exponent can seemingly become negative because of the bias. If you see a binary value of +125 in the exponent field, after you "unbias" it, the actual exponent value is -2. This happens because to "unbias" in this context means to subtract 127. At other times, the exponent will remain positive even after subtracting 127. If you are just looking at bits:
[0][01111111][00000000000000000000000]
has an exponent of zero, even though you see all those 1s in there: 127 - 127 = 0, so this pattern encodes 1.0x2^0 = 1.0. These examples are for single-precision (32 bit) floating-point numbers for processors using the IEEE 754 standard. When using this standard, values are stored like this:
[sign][biased exponent][significand]
The SIGN bit is the sign of the number as a whole (that is, of the fraction portion), NOT of the exponent. Your eyes have to move the sign bit and the exponent to make those numbers look more natural, say 1.01x2^5 like you'd see in math class. By the way, 1.01x2^5 is considered a "normal" number because there is exactly one nonzero digit to the left of the binary point, and of course this version of scientific notation multiplies by 2, not 10, because we are using base 2, which makes moving the binary point easy!
Let's look at an example, the decimal 0.15625. First I'll visually move the exponent:
0 01111100 01000000000000000000000     (exponent)
--+------+-+-----------------------        ^
  |      |                                 |
  +------+                                 |
  subtract 127 here                        |
      |                                    |
      v                                    |
      +----------------->------------------+
Here the exponent field is 124, so subtract 127 to get -3. Now remember the implied 1, so you'll have 1.01000000000000000000000. Forget all those zeros: 1.01x2^-3 is the binary number 0.00101, which is 1/8 + 1/32 = 0.15625. Also remember, the first bit was a zero, so the "finalized" number is a positive 0.15625. We could just as easily have -0.15625 if we had 1 01111100 01000000000000000000000 to start with.
Here are those special cases noted above, and yes there is a positive and negative infinity:
     31
     |
     | 30    23 22                    0
     | |      | |                     |
-----+-+------+-+---------------------+
qnan 0 11111111 10000000000000000000000
snan 0 11111111 01000000000000000000000
inf  0 11111111 00000000000000000000000
-inf 1 11111111 00000000000000000000000
-----+-+------+-+---------------------+
     | |      | |                     |
     | +------+ +---------------------+
     |     |               |
     |     v               v
     | exponent         fraction
     |
     v
     sign
I found all this in the Intel Manual, Table 4-3 on pg 91 in the 4 vol set.