I'm writing SSE code for a 2-D convolution, but the SSE documentation is very sparse.
I compute the dot product with _mm_dp_ps and use _mm_extract_ps to get the result, but _mm_extract_ps returns a hex float (the value's bit pattern as an int), and I can't figure out how to convert it to a regular float.
I could use __builtin_ia32_vec_ext_v4sf, which returns a float, but I want to stay compatible with other compilers. This is how GCC defines _mm_extract_ps:
_mm_extract_ps (__m128 __X, const int __N)
{
union { int i; float f; } __tmp;
__tmp.f = __builtin_ia32_vec_ext_v4sf ((__v4sf)__X, __N);
return __tmp.i;
}
What am I missing?
Any help would be appreciated, thanks.
OpenSUSE 11.2, GCC 4.4.1, C++
Compiler options: -fopenmp -Wall -O3 -msse4.1 -march=core2
Linker options: -lgomp -Wall -O3 -msse4.1 -march=core2
You should be able to use _MM_EXTRACT_FLOAT.
Incidentally, it looks to me as if _mm_extract_ps and _MM_EXTRACT_FLOAT should be the other way around, i.e. _mm_extract_ps should return a float and _MM_EXTRACT_FLOAT should return the int representation, but what do I know.
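To make the "int representation" concrete: the int returned by _mm_extract_ps holds the 32 bits of the selected lane, so 1.5f shows up as 0x3FC00000. Getting the float back is a bit copy, not a numeric cast. A minimal sketch, assuming IEEE-754 single-precision floats; bits_to_float is a hypothetical helper name, not part of any intrinsics header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 32-bit pattern as a float.
 * memcpy copies the bits unchanged and avoids the aliasing problems
 * of a pointer cast; a value cast like (float)bits would instead
 * convert the integer numerically. */
static float bits_to_float(int32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits); /* assumes a 32-bit float */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With SSE4.1 enabled, the value returned by _mm_extract_ps(v, n) could be passed to such a helper, which is essentially what _MM_EXTRACT_FLOAT does for you.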
_mm_cvtss_f32(_mm_shuffle_ps(__X, __X, __N)) will do the job, where __N is a compile-time constant selecting the element. It needs only SSE, not SSE4.1.
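Because the shuffle immediate must be a compile-time constant, a run-time lane index needs a small dispatch. A sketch under that assumption; EXTRACT_LANE and extract_lane are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <xmmintrin.h> /* SSE: _mm_shuffle_ps, _mm_cvtss_f32, _MM_SHUFFLE */

/* Hypothetical macro: broadcast the chosen lane to every position,
 * then read lane 0 as a float. `lane` must be a constant 0..3. */
#define EXTRACT_LANE(v, lane) \
    _mm_cvtss_f32(_mm_shuffle_ps((v), (v), _MM_SHUFFLE(lane, lane, lane, lane)))

/* Hypothetical helper for a lane index known only at run time:
 * each case passes a literal constant to the shuffle. */
static float extract_lane(__m128 v, int lane)
{
    assert(lane >= 0 && lane < 4);
    switch (lane) {
    case 0:  return _mm_cvtss_f32(v);
    case 1:  return EXTRACT_LANE(v, 1);
    case 2:  return EXTRACT_LANE(v, 2);
    default: return EXTRACT_LANE(v, 3);
    }
}
```

In a convolution inner loop the lane is normally fixed, so the macro form (or plain _mm_cvtss_f32 for lane 0) is likely the better fit; the switch is only there for the run-time case.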
And just to exemplify all that has been mentioned so far:
main.c
#include <assert.h>
#include <x86intrin.h>

int main(void) {
    /* 32-bit. */
    {
        __m128 x = _mm_set_ps(1.5f, 2.5f, 3.5f, 4.5f);

        /* _MM_EXTRACT_FLOAT */
        float f;
        _MM_EXTRACT_FLOAT(f, x, 3);
        assert(f == 1.5f);
        _MM_EXTRACT_FLOAT(f, x, 2);
        assert(f == 2.5f);
        _MM_EXTRACT_FLOAT(f, x, 1);
        assert(f == 3.5f);
        _MM_EXTRACT_FLOAT(f, x, 0);
        assert(f == 4.5f);

        /* _mm_cvtss_f32 + _mm_shuffle_ps */
        assert(_mm_cvtss_f32(x) == 4.5f);
        assert(_mm_cvtss_f32(_mm_shuffle_ps(x, x, 1)) == 3.5f);
        assert(_mm_cvtss_f32(_mm_shuffle_ps(x, x, 2)) == 2.5f);
        assert(_mm_cvtss_f32(_mm_shuffle_ps(x, x, 3)) == 1.5f);
    }

    /* 64-bit. */
    {
        __m128d x = _mm_set_pd(1.5, 2.5);

        /* _mm_cvtsd_f64 + _mm_unpackhi_pd */
        assert(_mm_cvtsd_f64(x) == 2.5);
        assert(_mm_cvtsd_f64(_mm_unpackhi_pd(x, x)) == 1.5);
    }
}
GitHub upstream.
Compile and run:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
./main.out
Doubles mentioned at: _mm_cvtsd_f64 analogon for higher order floating point
Tested on Ubuntu 19.04 amd64.
extern void _mm_store_ss(float*, __m128);
It stores the lowest element of the vector to the given float. See 'xmmintrin.h'.
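A short sketch of the store-based route, using only SSE intrinsics from xmmintrin.h; low_lane and all_lanes are hypothetical helper names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h> /* SSE: _mm_store_ss, _mm_storeu_ps */

/* Hypothetical helper: write the lowest lane of v into a float. */
static float low_lane(__m128 v)
{
    float f;
    _mm_store_ss(&f, v);
    return f;
}

/* Hypothetical helper: copy all four lanes into an array.
 * _mm_storeu_ps has no alignment requirement on `out`. */
static void all_lanes(__m128 v, float out[4])
{
    assert(out != NULL);
    _mm_storeu_ps(out, v);
}
```

Since _mm_dp_ps can place the dot product in the lowest element (depending on its mask), low_lane alone may be enough for the original use case.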