Would doing an arithmetic operation on a pair of signed and unsigned numbers be legal?

I'm more than halfway through learning assembly, and I'm familiar with how signed and unsigned integers are represented in bits. The answer may seem obvious, but I'm wondering whether an arithmetic operation such as addition makes sense when one of the two operands is considered signed and the other unsigned. I've thought of several examples, like the one below, that yield a correct result:

  10000001   (1-byte integer, considered unsigned, equivalent to 129)
+ 11111111   (1-byte integer, considered signed (two's complement), equivalent to -1)
----------
  10000000   (1-byte integer, in unsigned logic equivalent to 128)

Now if the first value were in the AL register and we executed the following instruction (in GAS syntax; note the $ that marks -1 as an immediate rather than a memory address):

addb $-1, %al

then the carry flag (CF) of the EFLAGS register will be set after the operation, reporting an overflow that has not actually happened. Perhaps, because only one of the numbers is unsigned, the overflow flag (OF) of the EFLAGS register should be consulted instead. So I'm confused about whether doing such a thing is ever sensible.

Mathematically, you do not add signed or unsigned numbers. There are only values modulo 2³² (assuming that you have 32-bit registers). Such values cover a range of 2³² consecutive integers, but you are free to interpret that range as beginning just about anywhere. "Signed" and "unsigned" are just two such interpretations.

In other words, with 4-bit registers, the unsigned interpretation of "1011" is eleven, while the signed interpretation is minus-five. But there is only one value (which mathematicians usually call "eleven modulo 2⁴" because mathematicians are traditionally fond of the unsigned interpretation). For instance, if you add "0110" to that value (which is "six" in both signed and unsigned interpretations), then you get "0001", which is the proper value: minus-five plus six yields one, and eleven plus six is seventeen, which is also equal to one when reduced modulo 2⁴ (seventeen is one plus sixteen; "reducing modulo 2⁴" is about dividing by sixteen [that's 2⁴] and keeping only the remainder).
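To make the 4-bit example concrete, here is a minimal C sketch. The helper names (add4, as_unsigned4, as_signed4) are made up for illustration and the 4-bit "register" is simulated by masking, so treat it as a sketch of the idea rather than anything a real CPU exposes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers simulating a 4-bit register inside a byte. */

/* Add two 4-bit patterns and keep only the low 4 bits (reduction modulo 2^4). */
uint8_t add4(uint8_t a, uint8_t b) {
    return (uint8_t)((a + b) & 0x0F);
}

/* Read a 4-bit pattern with the unsigned interpretation: 0..15. */
int as_unsigned4(uint8_t bits) {
    return bits & 0x0F;
}

/* Read the same pattern with the two's complement interpretation: -8..7. */
int as_signed4(uint8_t bits) {
    int v = bits & 0x0F;
    return (v & 0x08) ? v - 16 : v;
}

/* The worked example from the answer: 1011 + 0110 = 0001. */
void check_four_bit_example(void) {
    uint8_t x = 0x0B;          /* 1011 */
    uint8_t y = 0x06;          /* 0110 */
    uint8_t sum = add4(x, y);  /* one addition, no notion of signedness */

    assert(as_unsigned4(x) == 11 && as_signed4(x) == -5);
    assert(as_unsigned4(y) == 6 && as_signed4(y) == 6);
    assert(as_unsigned4(sum) == 1 && as_signed4(sum) == 1);
}
```

Calling check_four_bit_example() from a main function runs the three checks; the point is that add4 is used once and both readings of the result agree.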

Another way to say that is the following: the number of (binary) digits of a numerical value is conceptually infinite to the left. The CPU register only keeps the 32 rightmost bits. The unsigned interpretation assumes, by convention, that all the bits further left are zero. The signed interpretation assumes, by convention, that all the bits further left have the same value as bit 31 (i.e. all are zero, or all are one). Either way, when you perform an addition (or a subtraction or a multiplication), carries propagate from right to left, not the other way round, so the values of those ignored bits have no bearing whatsoever on the 32-bit result. So there is only one "add" opcode, which does not care the slightest bit about whether its operands are, in the brain of the programmer, "signed" or "unsigned".
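The "ignored bits to the left" argument can be checked with a small C sketch. As an assumption for illustration, it scales the picture down: the register is 8 bits wide and 32 bits stand in for the conceptually infinite value. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Widen an 8-bit pattern assuming the missing left bits are all zero
   (the unsigned convention). */
uint32_t zero_extend8(uint8_t bits) {
    return (uint32_t)bits;
}

/* Widen an 8-bit pattern by copying bit 7 into the missing left bits
   (the signed convention). */
uint32_t sign_extend8(uint8_t bits) {
    return (bits & 0x80u) ? (0xFFFFFF00u | bits) : (uint32_t)bits;
}

/* Add two widened values and keep only the 8 rightmost bits. */
uint8_t low8_of_sum(uint32_t a, uint32_t b) {
    return (uint8_t)((a + b) & 0xFFu);
}

/* Whatever convention is used for each operand, the low 8 bits agree. */
void check_extension_is_irrelevant(void) {
    uint8_t a = 0x81, b = 0xFF;   /* the bytes from the question */
    uint8_t expected = 0x80;

    assert(low8_of_sum(zero_extend8(a), zero_extend8(b)) == expected);
    assert(low8_of_sum(zero_extend8(a), sign_extend8(b)) == expected);
    assert(low8_of_sum(sign_extend8(a), zero_extend8(b)) == expected);
    assert(low8_of_sum(sign_extend8(a), sign_extend8(b)) == expected);
}
```

The second assertion is the mixed case from the question (first operand unsigned, second signed), and it produces the same byte as the other three.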

Signedness must be taken into account when performing an operation which is not compatible with modular arithmetic. Conversion into a sequence of decimal digits for display is such an operation. A more frequent case, however, is comparison. Values modulo 2³² are not ordered; they form a kind of cyclic loop (when you add 1 to 2³²-1 and reduce modulo 2³², you get back to 0). Comparisons make sense only when you consider integers in the whole range of integers. At that point, you must decide whether you use the signed or unsigned interpretation. This is why x86 processors offer both jg (jump if greater, signed interpretation) and ja (jump if above, unsigned interpretation).
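A short C sketch of why comparison needs an interpretation. The two functions are hypothetical stand-ins for the relations that ja and jg test; they only model the ordering, not the actual flag logic of the jump instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "Above": compare the two 32-bit patterns as unsigned values. */
int is_above(uint32_t a, uint32_t b) {
    return a > b;
}

/* "Greater": reinterpret the same 32-bit patterns as two's complement
   values (memcpy copies the bits unchanged) and compare those. */
int is_greater(uint32_t a, uint32_t b) {
    int32_t sa, sb;
    memcpy(&sa, &a, sizeof sa);
    memcpy(&sb, &b, sizeof sb);
    return sa > sb;
}

/* The all-ones pattern is the largest unsigned value but reads as -1 signed. */
void check_comparison_depends_on_interpretation(void) {
    uint32_t all_ones = 0xFFFFFFFFu;
    uint32_t one = 1u;

    assert(is_above(all_ones, one));     /* 4294967295 > 1 */
    assert(!is_greater(all_ones, one));  /* -1 > 1 is false */
}
```

The same two bit patterns give opposite answers, which is the situation where you have to pick one of the two jump families.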

Whether a number or operation is signed or unsigned is just a matter of interpretation. What will happen when you do the add is that the two numbers get added together to make 10000000, with a 1 in the carry flag (because it "went off the front end"). It's then up to your subsequent operations to interpret what that means (if you use the bit elsewhere, it's like you're treating the operation as an unsigned add without wrapping; if you throw the bit away, it's as if you were doing a signed add, where the carry has no meaning).
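For the mixed case in the question (unsigned byte plus signed byte, result read as unsigned), one way to see what the carry means is to compute the exact sum in a wider type and compare. This is a sketch under that assumption; the struct and function names are hypothetical. The rule it checks, that the result fits in 0..255 exactly when the carry equals the sign of the signed operand, is derived here for illustration and is not something the flags report directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record describing one mixed addition. */
struct mixed_sum {
    uint8_t bits;      /* what the 8-bit register would hold */
    bool    carry;     /* carry out of bit 7, i.e. what CF would show */
    bool    in_range;  /* true if the exact sum fits in 0..255 */
};

/* Add an unsigned byte and a signed byte, reading the result as unsigned. */
struct mixed_sum add_unsigned_signed(uint8_t a, int8_t b) {
    struct mixed_sum r;
    int exact = (int)a + (int)b;              /* exact sum, no wrapping */
    unsigned raw = (unsigned)a + (uint8_t)b;  /* what the 8-bit adder sees */

    r.bits = (uint8_t)raw;
    r.carry = raw > 0xFFu;
    r.in_range = exact >= 0 && exact <= 255;
    return r;
}

/* Exhaustively check: the result is in range iff carry == (b is negative). */
void check_carry_rule(void) {
    for (int a = 0; a <= 255; a++) {
        for (int b = -128; b <= 127; b++) {
            struct mixed_sum r = add_unsigned_signed((uint8_t)a, (int8_t)b);
            assert(r.in_range == !(r.carry ^ (b < 0)));
        }
    }
}
```

With the question's operands, add_unsigned_signed(129, -1) gives bits 10000000 with the carry set and in_range true: the carry is expected there because the second operand is negative, so no overflow occurred.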

At the binary level, there is only one addition operation:

  0101   (5)
+ 1010   (unsigned 10 or signed -6)
--------
  1111   (unsigned 15 or signed -1)

As for the carry and overflow flags, both are set according to simple rules. CF can be used to detect an overflow iff we consider the operands unsigned, and OF can be used to detect an overflow iff we consider both of them signed. Both flags are set by every addition, regardless of interpretation, and it's up to you to decide which of them to use.

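The actual formula for the OF flag on an addition is

OF = CF xor carry_into_MSB

where CF is the carry out of the most significant bit. When both operands have the same sign bit, this reduces to OF = CF xor MSB_of_result; when their sign bits differ, OF is always 0, since such an addition cannot overflow in the signed interpretation.

This means that if we add two positive numbers (that we consider signed) and the result is negative, then it overflowed.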
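Here is a hedged C sketch of how the two flags can be derived for an 8-bit addition. It uses the usual description of signed overflow as "carry into the top bit differs from carry out of it"; the struct and function names are hypothetical, and this models only add, not the other instructions that touch these flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record: the 8-bit sum plus the two flags of interest. */
struct add8_result {
    uint8_t sum;
    bool    cf;  /* carry out of bit 7: unsigned overflow */
    bool    of;  /* signed overflow */
};

/* Model an 8-bit add and derive CF and OF from the carries around bit 7. */
struct add8_result add8(uint8_t a, uint8_t b) {
    struct add8_result r;
    unsigned wide = (unsigned)a + (unsigned)b;  /* 9-bit exact sum */
    unsigned low7 = (a & 0x7Fu) + (b & 0x7Fu);  /* sum of bits 0..6 only */
    bool carry_in = (low7 >> 7) & 1u;           /* carry into bit 7 */

    r.sum = (uint8_t)wide;
    r.cf  = (wide >> 8) & 1u;                   /* carry out of bit 7 */
    r.of  = r.cf != carry_in;                   /* the two carries differ */
    return r;
}

void check_flag_examples(void) {
    /* The question's bytes: unsigned 129 + 255 wraps, signed -127 + -1 fits. */
    assert(add8(0x81, 0xFF).sum == 0x80);
    assert(add8(0x81, 0xFF).cf && !add8(0x81, 0xFF).of);

    /* 127 + 1: fine as unsigned (128), overflows as signed. */
    assert(!add8(0x7F, 0x01).cf && add8(0x7F, 0x01).of);

    /* 5 + (-6): operands of different sign never overflow as signed. */
    assert(add8(0x05, 0xFA).sum == 0xFF);
    assert(!add8(0x05, 0xFA).cf && !add8(0x05, 0xFA).of);
}
```

Each example reads the same addition twice: cf answers "did it overflow if both operands were unsigned?", and of answers the same question for signed operands.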

"Signed" and "unsigned" are interpretations. An assembly instruction will generally have its interpretation documented. I am not aware of any architecture with an ADD-SIGNED-UNSIGNED instruction that interprets one of its arguments as a signed value and the other as unsigned. There seems to be little value in it, too. With two's complement integer arithmetic, the only difference would be in some flag bits anyway.

I found this very nice article on the issue that was my main concern and the answer is clear after reading the article.