Free/open source C/C++ library of vectorized math functions? [closed]

开发者 https://www.devze.com 2023-03-25 09:21 Source: web
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.

Closed 9 years ago.

I'm looking for a free/open source C/C++ (either is acceptable) library of vectorized versions of common math functions (such as ln or exp) similar to Intel's Vector Math Library for Linux. I'd like a library that would provide me with the ability to write something like:

double a[ARRAY_SIZE], b[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; ++i) {
    a[i] = log(b[i]);  /* natural logarithm; C has no ln() */
}

as:

double a[ARRAY_SIZE], b[ARRAY_SIZE];
vectorized_ln(a, b, ARRAY_SIZE);

and have it use the full power of the SIMD instructions available on the Intel and AMD architectures. The development environment consists of GNU tools running on Linux. Intel's Math Kernel Library contains a component called the Vector Math Library, which advertises "vector implementations of computationally intensive core mathematical functions" including basic functions, trig functions, etc., so I'm looking for something like that, but free.


I developed Yeppp!, an open-source (BSD-licensed) mathematical library that provides some vector elementary functions (log, exp, sin, cos, tan) and is competitive with MKL in performance. Here is an example of using the vector logarithm function from Yeppp!


Felix von Leitner has written an extensive presentation on the actual assembly produced by various C compilers.

His notes on vectorization of simple operations start on slide 28.

  • For GCC 4.4 and a memset type loop

    • gcc -O2 generates a loop that writes one byte at a time
    • gcc -O3 vectorizes, writes 32-bit (x86) or 128-bit (x86 with SSE or x64) at a time
    • impressive: the vectorized code checks and fixes the alignment first

Slide 41 is entitled "Outsmarting the Compiler - simd-shift" and concludes that "gcc is smarter than the video codec programmer on all platforms".

Slide 42 is another case where gcc will automatically vectorize naive code.

All of which adds up to this: check first whether the compiler you are using will simply handle it for you.
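
A quick way to run that check is to compile a memset-type loop like the one described in the slides and inspect the result. The sketch below uses a hypothetical function name, fill_bytes. Compiling it once with gcc -O2 -S and once with gcc -O3 -S and comparing the assembly shows what a given compiler version does. Newer GCC releases also accept -fopt-info-vec to report which loops were vectorized, and some versions may replace the whole loop with a call to memset, so the output will not necessarily match what the slides show for GCC 4.4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memset-type loop: stores `value` into each of the
 * first n bytes of dst. Written naively, one byte per iteration,
 * so that any wider stores in the assembly come from the compiler. */
void fill_bytes(unsigned char *dst, unsigned char value, size_t n)
{
    assert(n == 0 || dst != NULL);
    for (size_t i = 0; i < n; ++i) {
        dst[i] = value;
    }
}
```

The buffer length is left as a runtime parameter on purpose: with an unknown n the compiler has to deal with alignment and leftover bytes itself, which is the behaviour the slides call impressive.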


You might find AMD's LibM library (it is x64-only, however) combined with SSEPlus to be of use. There is also an open-source x86 variant of Sony's Vector Math library.


Besides writing these functions yourself (which isn't exactly rocket science) or using Ignacio's link, Intel's SPMD compiler might be something for you: http://ispc.github.com/

It's a C-style compiler in which you write code in a serial/scalar fashion, and it parallelizes that code with a certain target architecture in mind. The resulting functions are easy to call from your regular C++ project.

I quote: "ispc compiles a C-based SPMD programming language to run on the SIMD units of CPUs; it frequently provides a 3x or more speedup on CPUs with 4-wide SSE units, without any of the difficulty of writing intrinsics code."

I have yet to try it myself, but it looks good for parallelizing generic calculations.
