x86 4byte floats vs. 8byte doubles (vs. long long)?

We have a measurement data processing application and currently all data is held as C++ float which means 32bit/4byte on our x86/Windows platform. (32bit Windows Application).

Since precision is becoming an issue, there have been discussions to move to another datatype. The options currently discussed are switching to double (8byte) or implementing a fixed decimal type on top of __int64 (8byte).

The reason the fixed-decimal solution using __int64 as underlying type is even discussed is that someone claimed that double performance is (still) significantly worse than processing floats and that we might see significant performance benefits using a native integer type to store our numbers. (Note that we really would be fine with fixed decimal precision, although the code would obviously become more complex.)

Obviously we need to benchmark in the end, but I would like to ask whether the statement that doubles are worse holds any truth looking at modern processors? I guess for large arrays doubles may mess up cache hits more than floats, but otherwise I really fail to see how they could differ in performance?


It depends on what you do. Additions, subtractions and multiplies on double are just as fast as on float on current x86 and POWER architecture processors. Divisions, square roots and transcendental functions (exp, log, sin, cos, etc.) are usually notably slower with double arguments, since their runtime is dependent on the desired accuracy.

If you go fixed point, multiplies and divisions need to be implemented with long integer multiply / divide instructions which are usually slower than arithmetic on doubles (since processors aren't optimized as much for it). Even more so if you're running in 32 bit mode where a long 64 bit multiply with 128 bit results needs to be synthesized from several 32-bit long multiplies!
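
For illustration (the Fixed type and Q32.32 layout below are assumptions, not the poster's design), this is roughly what multiply and divide look like for a 64-bit fixed-point type, and why a wide 128-bit intermediate is unavoidable:

    #include <cstdint>

    // Hypothetical Q32.32 fixed-point number: 32 integer bits, 32 fractional bits.
    struct Fixed {
        int64_t raw;   // stored value = real value * 2^32
    };

    // A Q32.32 multiply needs the full 64x64 -> 128-bit product before shifting
    // back. __int128 is a GCC/Clang extension on 64-bit targets; on a 32-bit
    // build you would have to synthesize it from several 32-bit multiplies
    // (or use compiler helpers such as _mul128 on x64 MSVC).
    inline Fixed mul(Fixed a, Fixed b) {
        __int128 wide = static_cast<__int128>(a.raw) * b.raw;
        return Fixed{ static_cast<int64_t>(wide >> 32) };
    }

    // Division is worse: the dividend must be widened and pre-shifted to keep
    // the fractional bits, then divided by a 64-bit value.
    inline Fixed div(Fixed a, Fixed b) {
        __int128 wide = (static_cast<__int128>(a.raw) << 32) / b.raw;
        return Fixed{ static_cast<int64_t>(wide) };
    }

By contrast, a double multiply or divide is a single hardware instruction.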

Cache utilization is a red herring here. 64-bit integers and doubles are the same size - if you need more than 32 bits, you're gonna eat that penalty no matter what.


Look it up. Both AMD and Intel publish the instruction latencies for their CPUs in freely available PDF documents on their websites.

However, for the most part, performance won't be significantly different, for a couple of reasons:

  • when using the x87 FPU instead of SSE, all floating point operations are calculated at 80 bits precision internally, and then rounded off, which means that the actual computation is equally expensive for all floating-point types. The only cost is really memory-related then (in terms of CPU cache and memory bandwidth usage, and that's only an issue in float vs double, but irrelevant if you're comparing to int64)
  • with or without SSE, nearly all floating-point operations are pipelined. When using SSE, the double instructions may (I haven't looked this up) have a higher latency than their float equivalents, but the throughput is the same, so it should be possible to achieve similar performance with doubles.

It's also not a given that a fixed-point datatype would actually be faster either. It might, but the overhead of keeping this datatype consistent after some operations might outweigh the savings. Floating-point operations are fairly cheap on a modern CPU. They have a bit of latency, but as mentioned before, they're generally pipelined, potentially hiding this cost.

So my advice:

  1. Write some quick tests. It shouldn't be that hard to write a program that performs a number of floating-point ops, and then measure how much slower the double version is relative to the float one (a rough sketch follows below this list).
  2. Look it up in the manuals, and see for yourself if there's any significant performance difference between float and double computations
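
A minimal sketch of the quick test from point 1 (array size, iteration count and the sum-of-products workload are arbitrary placeholders; compile with optimizations and print the checksum so the loop isn't optimized away):

    #include <chrono>
    #include <cstdio>
    #include <vector>

    // Time the same sum-of-products loop instantiated for float and for double.
    template <typename T>
    double run(std::size_t n, int iterations) {
        std::vector<T> a(n, T(1.0001)), b(n, T(0.9999));
        T acc = 0;
        auto start = std::chrono::steady_clock::now();
        for (int it = 0; it < iterations; ++it)
            for (std::size_t i = 0; i < n; ++i)
                acc += a[i] * b[i];
        auto stop = std::chrono::steady_clock::now();
        std::printf("checksum %f\n", static_cast<double>(acc));  // keep the work alive
        return std::chrono::duration<double>(stop - start).count();
    }

    int main() {
        const std::size_t n = 1 << 20;   // ~1M elements per array
        std::printf("float : %.3f s\n", run<float>(n, 100));
        std::printf("double: %.3f s\n", run<double>(n, 100));
    }

Run it with the compiler settings your application actually uses (x87 vs. SSE2, optimization level), since those choices change the outcome.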


I have trouble understanding the rationale "since double is slower than float, we'll use 64-bit int". Guessing at performance has always been a black art requiring a lot of experience; on today's hardware it is even worse considering the number of factors to take into account. Even measuring is difficult. I know of several cases where micro-benchmarks pointed to one solution, but measurement in context showed that another was better.

First note that two of the factors given to explain the claimed slower performance of double compared to float are not pertinent here: the bandwidth needed will be the same for double as for 64-bit int, and SSE2 vectorization would give an advantage to double...

Then consider that using integer computation will increase the pressure on the integer registers and execution units while the floating-point ones sit idle. (I've already seen cases where doing integer computation in double was a win, attributed to the additional execution units made available.)

So I doubt that rolling your own fixed-point arithmetic would be advantageous over using double (but I could be shown wrong by measurements).


Implementing 64-bit fixed point isn't really fun, especially for more complex functions like sqrt or logarithm. Integers will probably still be a bit faster for simple operations like additions. And you'll need to deal with integer overflows. And you need to be careful when implementing rounding, or errors can easily accumulate.

We're implementing fixed point in a C# project because we need determinism, which floating point on .NET doesn't guarantee. And it's relatively painful. Some formula contained x^3 and, bang, integer overflow. Unless you have really compelling reasons not to, use float or double instead of fixed point.
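
To make the x^3 trap concrete, a sketch in C++ (hypothetical Q32.32 layout, not the poster's C# code): the cube of a perfectly representable input already overflows the 64-bit representation, whereas a double merely rounds its low-order bits.

    #include <cstdint>
    #include <cstdio>

    // Hypothetical Q32.32 multiply; __int128 is a GCC/Clang extension.
    int64_t fixed_mul(int64_t a, int64_t b) {
        return static_cast<int64_t>((static_cast<__int128>(a) * b) >> 32);
    }

    int main() {
        // The integer part of a signed Q32.32 value tops out at 2^31 - 1 (~2.1e9),
        // so x = 1300 is fine but x^3 = 2.197e9 no longer fits and silently wraps.
        int64_t x = static_cast<int64_t>(1300) << 32;   // 1300.0 in Q32.32
        int64_t cube = fixed_mul(fixed_mul(x, x), x);
        std::printf("fixed x^3 : %f (expected 2197000000)\n", cube / 4294967296.0);
        std::printf("double x^3: %f\n", 1300.0 * 1300.0 * 1300.0);
    }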

SIMD instructions from SSE2 complicate the comparison further, since they allow operating on several floating-point numbers (4 floats or 2 doubles) at the same time. I'd use double and try to take advantage of these instructions. So double will probably be significantly slower than float, but comparing with ints is difficult, and I'd prefer float/double over fixed point in most scenarios.


It's always best to measure instead of guess. Yes, on many architectures calculations on doubles process twice as much data as calculations on floats (and long doubles are slower still). However, as other answers and comments here have pointed out, the x86 architecture doesn't follow the same rules as, say, ARM or SPARC processors. On x86, when using the x87 FPU, floats, doubles and long doubles are all converted to 80-bit long doubles for computation. I should have known this, because that conversion makes x86 results more accurate than SPARC's, and Sun went through a lot of trouble to get the less accurate results for Java, sparking some debate (note that the page in question is from 1998; things have since changed).

Additionally, calculations on doubles are built into the CPU, whereas calculations on a fixed-decimal datatype would be implemented in software and potentially slower.

You should be able to find a decent fixed-size decimal library and compare.


With various SIMD instruction sets you can perform 4 single-precision floating-point operations at the cost of one; essentially, you pack 4 floats into a single 128-bit register. When switching to doubles you can only pack 2 doubles into these registers, and hence you can only do two operations at the same time.
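
As a rough illustration with SSE2 intrinsics (example functions, not from the question): the same 128-bit register and the same add instruction cover four float lanes but only two double lanes.

    #include <emmintrin.h>   // SSE2 intrinsics

    // One 128-bit XMM register holds 4 floats ...
    void add4f(const float* a, const float* b, float* out) {
        __m128 va = _mm_loadu_ps(a);                 // a[0..3]
        __m128 vb = _mm_loadu_ps(b);
        _mm_storeu_ps(out, _mm_add_ps(va, vb));      // 4 additions per instruction
    }

    // ... but only 2 doubles.
    void add2d(const double* a, const double* b, double* out) {
        __m128d va = _mm_loadu_pd(a);                // a[0..1]
        __m128d vb = _mm_loadu_pd(b);
        _mm_storeu_pd(out, _mm_add_pd(va, vb));      // 2 additions per instruction
    }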


As many people have said, a 64-bit int is probably not worth it if double is an option, at least when SSE is available. This might be different on microcontrollers of various kinds, but I guess that is not your application. If you need additional precision in long sums of floats, you should keep in mind that this operation is sometimes problematic with floats and doubles and would be more exact on integers.
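
A small sketch of that last point (the 0.01 step and the count are arbitrary): repeatedly adding a small float drifts noticeably, a double accumulator drifts far less, and counting integer units of the quantum stays exact.

    #include <cstdint>
    #include <cstdio>

    int main() {
        const int n = 10000000;          // ten million additions of 0.01
        float   fsum = 0.0f;
        double  dsum = 0.0;
        int64_t isum = 0;                // counting in units of 0.01

        for (int i = 0; i < n; ++i) {
            fsum += 0.01f;               // rounding error accumulates as fsum grows
            dsum += 0.01;                // same effect, but far smaller
            isum += 1;                   // exact by construction
        }
        std::printf("float : %f\n", fsum);
        std::printf("double: %f\n", dsum);
        std::printf("int64 : %f\n", isum / 100.0);   // expected 100000.000000
    }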
