Fixed-point multiplication in a known range_问答_开发者

Fixed-point multiplication in a known range

开发者 https://www.devze.com 2023-01-08 07:36 出处：网络

I\'m trying to multiply A*B in 16-bit fixed point, while keeping as much accuracy as possible. A is 16-bit in unsigned integer range, B is divided by 1000 and always between 0.001 and 9.999. It\'s bee

I'm trying to multiply A*B in 16-bit fixed point, while keeping as much accuracy as possible. A is 16-bit in unsigned integer range, B is divided by 1000 and always between 0.001 and 9.999. It's been a while since I dealt with problems like that, so:

I know I can just do A*B/1000 after moving to 32-bit variables, then strip back to 16-bit
I'd like to make it faster than that
I'd like to do all the operations without moving to 32-bit (since I've got 16-bit multiplication only)

Is there any easy way to do that?

Edit: A will be between 0 and 4000, so all possible results are in the 16-bit开发者_运维技巧 range too.

Edit: B comes from user, set digit-by-digit in the X.XXX mask, that's why the operation is /1000.

No, you have to go to 32 bit. In general the product of two 16 bit numbers will always give you a 32 bit wide result.

You should check the CPU instruction set of the CPU you're working on because most multiply instructions on 16 bit machines have an option to return the result as a 32 bit integer directly.

This would help you a lot because:

short testfunction (short a, short b)
{
  int A32 = a;
  int B32 = b;

  return A32*B32/1000
}

Would force the compiler to do a 32bit * 32bit multiply. On your machine this could be very slow or even done in multiple steps using 16bit multiplies only.

A little bit of inline assembly or even better a compiler intrinsic could speed things up a lot.

Here is an example for the Texas Instruments C64x+ DSP which has such intrinsics:

short test (short a, short b) 
{
  int product = _mpy (a,b); // calculates product, returns 32 bit integer
  return product / 1000;
}

Another thought: You're dividing by 1000. Was that constant your choice? It would be much faster to use a power of two as the base for your fixed-point numbers. 1024 is close. Why don't you:

  return (a*b)/1024

instead? The compiler could optimize this by using a shift right by 10 bits. That ought to be much faster than doing reciprocal multiplication tricks.