Binary translation | Cross compilation

Developer https://www.devze.com 2023-02-17 05:48 Source: web
Say you are writing compilers for different architectures, and the architectures have different endianness. You have memory read and write instructions.

Take the example of a store instruction, where you want to store the value 0xAA0xBB0xCC0xDD. When writing the assembly for this, do you write two different instructions for the different architectures, e.g.

For the little endian: st (reg), 0xDD0xCC0xBB0xAA

For the big endian: st (reg), 0xAA0xBB0xCC0xDD

Or do you write the same instruction, say st (reg), 0xAA0xBB0xCC0xDD, for both architectures and let the processor parse the instruction so that it takes care of the endianness of the system?

The reason I ask is that I don't know what a binary translator would do when it has to translate code between architectures of different endianness. If in architecture A you see the line st (reg), XY, do you convert it into st (reg), YX for architecture B? If that is the case, then what happens to memory reads?

I would like to know how to take care of endianness, considering memory reads and writes in binary translation.


Endianness has nothing to do with how individual bytes are read from or written to memory. It only determines, when several bytes of memory are interpreted as one number, whether the most significant byte comes first or last. What makes the difference is the implementation of the instructions that perform that interpretation.

So your binary translator, if such a thing even exists, does not need to rewrite the data itself; it is the instructions that treat memory as multi-byte numbers, such as word-sized loads and stores or ADD, SUB and MUL on memory operands, which interpret the bytes differently.
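As an illustration of the point about interpretation, here is a minimal Python sketch. The helper names store32 and load32 and the flat bytearray used as memory are assumptions made up for this example, not part of any real translator. It shows the same four bytes being read back as two different numbers depending on the byte order used.

```python
# Memory modelled as a flat byte array (hypothetical, for illustration only).
memory = bytearray(8)

def store32(mem, addr, value, byteorder):
    """Emulate a 32-bit store using the given byte order ('big' or 'little')."""
    mem[addr:addr + 4] = value.to_bytes(4, byteorder)

def load32(mem, addr, byteorder):
    """Emulate a 32-bit load using the given byte order ('big' or 'little')."""
    return int.from_bytes(mem[addr:addr + 4], byteorder)

# Store the value the way a big-endian machine would.
store32(memory, 0, 0xAABBCCDD, "big")

print(memory[:4].hex())                   # aabbccdd
print(hex(load32(memory, 0, "big")))      # 0xaabbccdd
print(hex(load32(memory, 0, "little")))   # 0xddccbbaa
```

Under this model, a translator running big-endian code on a little-endian host would probably keep using the original program's byte order for every multi-byte access, which has the same effect as swapping the bytes on each such load and store. Single-byte accesses need no adjustment.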


I'm not sure I understand your question fully, but it sounds like you want to translate some assembly-language code or a disassembled binary?

Every assembler I've ever worked with handles the endianness of constants in the sane way. That is to say, if you want to store 0xAABBCCDD, you would write:

st (reg), 0xAABBCCDD

And the assembler will swizzle the constant if necessary for the appropriate opcode. Endianness becomes a concern when you want to store multiple single-byte values using that one operation, for example writing a short null-terminated string "123" to memory with the same opcode. You have to swizzle that constant yourself in your assembly code so that the bytes land in memory in the right order on little- vs. big-endian systems:

st (reg), 0x31323300 // big-endian
st (reg), 0x00333231 // little-endian
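The two constants above can be derived mechanically. The following Python sketch (variable names are chosen only for this example) packs the bytes of the string "123" plus its terminator into a 32-bit constant for each byte order:

```python
# The bytes we want to appear in memory, in address order.
text = b"123\x00"

# The 32-bit constant that yields this layout on each kind of target.
big_const = int.from_bytes(text, "big")
little_const = int.from_bytes(text, "little")

print(f"0x{big_const:08X}")     # 0x31323300
print(f"0x{little_const:08X}")  # 0x00333231
```

Storing each constant with the matching byte order gives back the same four bytes, 0x31 0x32 0x33 0x00, in memory.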

The safe way is to just store the bytes in the order you want them:

stb (reg+0), 0x31
stb (reg+1), 0x32
stb (reg+2), 0x33
stb (reg+3), 0x00

But that takes four instructions instead of one.
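To see why the byte-wise version is safe, here is a small Python sketch. The functions stb and st are stand-ins for the pseudo-instructions used above, modelled on a bytearray, and are assumptions for illustration only. It compares four single-byte stores with a word store that uses the big-endian constant on a little-endian target by mistake.

```python
def stb(mem, addr, value):
    """Emulate a single-byte store; byte order plays no role here."""
    mem[addr] = value & 0xFF

def st(mem, addr, value, byteorder):
    """Emulate a 32-bit store on a target with the given byte order."""
    mem[addr:addr + 4] = value.to_bytes(4, byteorder)

# Four byte stores: the layout is the same on any target.
bytewise = bytearray(4)
for offset, byte in enumerate([0x31, 0x32, 0x33, 0x00]):
    stb(bytewise, offset, byte)

# One word store with the big-endian constant on a little-endian target.
wrong = bytearray(4)
st(wrong, 0, 0x31323300, "little")

print(bytewise.hex())  # 31323300
print(wrong.hex())     # 00333231
```

The byte-wise stores produce the string in the intended order, while the mismatched word store leaves the bytes reversed in memory.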
