
Reference manual/tutorial for x86 SIMD intrinsics? [closed]

开发者 https://www.devze.com 2023-03-23 17:58 Source: web
Closed. This question is seeking recommendations for books, tools, software libraries, and more. It does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

Closed 8 years ago.


I'm looking into using these to improve the performance of some code, but good documentation seems hard to find for the functions defined in the *mmintrin.h headers. Can anybody point me to good info on these?

EDIT: I'm particularly interested in a very basic tutorial on how to get started.


There's a handy online Intel Intrinsics Guide at https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html - it covers all the Intel SIMD instruction sets, from MMX through the various flavours of SSE up to AVX2 and later.

You can also get the following PDFs from Intel:

  • Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A: Instruction Set Reference, A-M (253666-021)

  • Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B: Instruction Set Reference, N-Z (253667-021)

  • Intel® SSE4 Programming Reference (D91561-001)
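
For the very basic starting point the question asks about, here is a minimal sketch in C of what using the intrinsics looks like. It assumes an x86 compiler with SSE enabled (the default on x86-64). `add_floats` is a hypothetical helper name made up for illustration; `_mm_loadu_ps`, `_mm_add_ps` and `_mm_storeu_ps` are real SSE intrinsics from xmmintrin.h that you can look up in the guide.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE: __m128 and the _mm_*_ps intrinsics */

/* Hypothetical helper: out[i] = a[i] + b[i] for n floats. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: four floats per iteration. The "u" variants of
       load/store do not require 16-byte aligned pointers. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }

    /* Scalar tail for the remaining 0..3 elements. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

The pattern (load into a vector register, operate on all lanes at once, store back, then handle the leftover elements with scalar code) is the shape most simple SSE loops take.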


The following is the best introduction to MMX/SSE programming I have ever found. (I've programmed SSE2 for 5 years and I still find this tutorial to be the most conceptually clear.)

http://www.tommesani.com/Docs.html

It is not a complete list of instructions, so once you're ready to learn more, start reading the Intel Intrinsics Guide, as @PaulR suggests.

One important thing to keep in mind is that MMX/SSE tend to be severely limiting when it comes to moving data around (shuffles, arbitrary permutations, or changing a single element). This is a limitation of the CPU silicon design. Gather instructions only arrived with AVX2 and scatter with AVX-512, so they might not even be available on your customers' computers.
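
To give a feel for the data-movement point, here is a rough sketch using only SSE. Both function names are hypothetical; the intrinsics (`_mm_shuffle_ps`, `_mm_movehl_ps`, `_mm_add_ss`, `_mm_cvtss_f32`) and the `_MM_SHUFFLE` macro are real ones from xmmintrin.h. Note that the shuffle pattern has to be a compile-time constant, and that something as simple as summing the four lanes of one register takes several steps.

```c
#include <xmmintrin.h>  /* SSE */

/* Hypothetical helper: reverse the lanes, {v0,v1,v2,v3} -> {v3,v2,v1,v0}.
   The lane selection is an immediate, so it cannot vary at run time. */
__m128 reverse_ps(__m128 v)
{
    return _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));
}

/* Hypothetical helper: horizontal sum v0+v1+v2+v3 using SSE only. */
float hsum_ps(__m128 v)
{
    /* Swap neighbours: {v1, v0, v3, v2}. */
    __m128 shuf = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1));
    /* {v0+v1, v0+v1, v2+v3, v2+v3} */
    __m128 sums = _mm_add_ps(v, shuf);
    /* Move the upper half of sums into the lower half: lane 0 = v2+v3. */
    shuf = _mm_movehl_ps(shuf, sums);
    /* Add the lowest lanes only: (v0+v1) + (v2+v3). */
    sums = _mm_add_ss(sums, shuf);
    return _mm_cvtss_f32(sums);
}
```

In practice you usually try to arrange your data so that work stays "vertical" (lane i of one register with lane i of another) and shuffles like these are kept out of the inner loop.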

There is a large repertoire of vectorization tricks for MMX/SSE, much as http://www.hackersdelight.org/ catalogues tricks for exploiting bit-parallel operations.
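
As an illustration of the kind of trick meant here, the sketch below fills two gaps in plain SSE2, which has no 32-bit integer absolute value or signed maximum (those intrinsics came with SSSE3 and SSE4.1 respectively). The function names are hypothetical; the intrinsics are real SSE2 ones from emmintrin.h. The absolute-value routine is the classic sign-mask bit trick applied per lane, and the maximum uses a compare mask in place of a branch.

```c
#include <emmintrin.h>  /* SSE2 */

/* Hypothetical helper: per-lane abs of four 32-bit ints.
   sign is 0 for non-negative lanes and -1 (all ones) for negative
   lanes, so (v ^ sign) - sign negates exactly the negative lanes.
   As with scalar abs, INT_MIN stays INT_MIN. */
__m128i abs_epi32_sse2(__m128i v)
{
    __m128i sign = _mm_srai_epi32(v, 31);
    return _mm_sub_epi32(_mm_xor_si128(v, sign), sign);
}

/* Hypothetical helper: per-lane select, mask ? a : b.
   Each mask lane must be all ones or all zeros. */
__m128i select_si128(__m128i mask, __m128i a, __m128i b)
{
    return _mm_or_si128(_mm_and_si128(mask, a),
                        _mm_andnot_si128(mask, b));
}

/* Hypothetical helper: per-lane signed max without branching. */
__m128i max_epi32_sse2(__m128i a, __m128i b)
{
    __m128i a_gt_b = _mm_cmpgt_epi32(a, b);  /* all ones where a > b */
    return select_si128(a_gt_b, a, b);
}
```

The compare-then-select idiom is the general replacement for per-element `if` statements, since all lanes of a register have to execute the same instructions.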
