I have an application created using VC++, and wanted to explore optimization opportunities by vectorizing some operations.
To begin with, I am trying the following code:
__m128i p1;
p1.m128i_u32[0] = 1;
p1.m128i_u32[1] = 2;
p1.m128i_u32[2] = 3;
p1.m128i_u32[3] = 4;
__m128i p2;
p2.m128i_u32[0] = 1;
p2.m128i_u32[1] = 2;
p2.m128i_u32[2] = 3;
p2.m128i_u32[3] = 4;
__m128i res2= _mm_mul_epi32(p1,p2);
However, I get an "unhandled exception" or "illegal operation" error when _mm_mul_epi32 is executed, and I have no clue why. Can someone please tell me what is wrong?
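_mm_mul_epi32 maps to the PMULDQ instruction, which was introduced with SSE4.1 (AVX-capable CPUs support it too). You need a reasonably recent CPU that supports SSE4.1, e.g. Nehalem or Sandy Bridge (Core i5, i7). On an older processor the instruction is not recognised, which produces exactly this kind of illegal-instruction exception.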
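To confirm whether the machine supports the instruction before calling the intrinsic, you can query CPUID at run time. The following is a minimal sketch, not a definitive implementation: the function name cpu_has_sse41 is made up for illustration, and it relies on SSE4.1 being reported in bit 19 of ECX for CPUID leaf 1. It uses __cpuid from <intrin.h> under VC++ and __get_cpuid from <cpuid.h> under GCC/Clang.

```cpp
#if defined(_MSC_VER)
#include <intrin.h>   // __cpuid (VC++)
#elif defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>    // __get_cpuid (GCC/Clang)
#endif

// Hypothetical helper: returns true if the CPU reports SSE4.1 support.
// SSE4.1 is indicated by bit 19 of ECX after CPUID leaf 1.
bool cpu_has_sse41()
{
    const unsigned int kSse41Bit = 1u << 19;
#if defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};          // EAX, EBX, ECX, EDX
    __cpuid(regs, 1);
    return (static_cast<unsigned int>(regs[2]) & kSse41Bit) != 0;
#elif defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;                    // leaf 1 not available
    return (ecx & kSse41Bit) != 0;
#else
    return false;                        // not an x86 target
#endif
}
```

If cpu_has_sse41() returns false, the code should fall back to a path that avoids _mm_mul_epi32, for example a plain scalar loop.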
Note also that you might find it easier and more succinct to use intrinsics to initialise SIMD vectors, e.g.
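__m128i p1 = _mm_setr_epi32(1, 2, 3, 4);  // setr: element 0 = 1, ..., element 3 = 4, as in the question
__m128i p2 = _mm_setr_epi32(1, 2, 3, 4);  // (_mm_set_epi32 takes its arguments in the reverse order)
__m128i res2 = _mm_mul_epi32(p1, p2);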
Shouldn't you be using the member m128i_i32 instead of m128i_u32?
This instruction multiplies two sets of 32-bit signed integers.
From MSDN.
If you really need m128i_u32, then you must use _mm_mul_epu32() instead.
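To make the signed/unsigned difference concrete, here is a scalar model of what the two intrinsics compute. It is only a sketch: the function names mul_epi32_model and mul_epu32_model are invented for illustration, and plain arrays stand in for __m128i. Both instructions use only elements 0 and 2 of each input and produce two 64-bit products, so elements 1 and 3 are ignored.

```cpp
#include <array>
#include <cstdint>

// Scalar model of _mm_mul_epi32 (PMULDQ): signed 32x32 -> 64-bit
// multiply of elements 0 and 2; elements 1 and 3 are ignored.
std::array<std::int64_t, 2> mul_epi32_model(const std::array<std::int32_t, 4>& a,
                                            const std::array<std::int32_t, 4>& b)
{
    return {{ static_cast<std::int64_t>(a[0]) * b[0],
              static_cast<std::int64_t>(a[2]) * b[2] }};
}

// Scalar model of _mm_mul_epu32 (PMULUDQ): same lanes, but the
// inputs are treated as unsigned 32-bit values.
std::array<std::uint64_t, 2> mul_epu32_model(const std::array<std::uint32_t, 4>& a,
                                             const std::array<std::uint32_t, 4>& b)
{
    return {{ static_cast<std::uint64_t>(a[0]) * b[0],
              static_cast<std::uint64_t>(a[2]) * b[2] }};
}
```

For the values in the question (1, 2, 3, 4 in both vectors), both models give the products 1 and 9. The results differ only when the top bit of an input is set: the bit pattern 0xFFFFFFFF is -1 in the signed model and 4294967295 in the unsigned one.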