I've started playing around with AVX instructions on the new Intel's Sandy Bridge processor. I'm using GCC 4.5.2, TDM-GCC 64bit build of MinGW64.
I want to overload operator<< for ostream to be able to print out the vector types __m256
, __m128
etc to the console. But I'm running into an overloading conflict. The 2nd function in the following code produces an error "conflicts with previous declaration void f(__vector(8) float)
":
void f(__开发者_开发百科m128 v) {
cout << 4;
}
void f(__m256 v) {
cout << 8;
}
It seems that the compiler cannot distinguish between the two types and consideres them both f(float __vector)
.
Is there a way around this? I haven't been able to find anything online. Any help is greatly appreciated.
I accidentally stumbled upon the answer when having a similar problem with function templates. In this case, the GCC error message actually suggested a solution:
add -fabi-version=4
compiler option.
This solves my problem, and hopefully doesn't cause any issues when linking the standard libraries.
One can read more about ABI (Application Binary Interface) and GCC at ABI Policy and Guidelines and ABI specification. ABI specifies how the functions names are mangled when the code is compiled into object files. Apparently, ABI version 3 used by GCC by default cannot distinguish between the various vector types.
I was unsatisfied with the solution of changing compiler ABI flags to solve this, so I went looking for a different solution. It seems they encountered this issue in writing the Eigen library - see this source file for details http://eigen.tuxfamily.org/dox-devel/SSE_2PacketMath_8h_source.html
My solution to this is a slightly tweaked version of theirs:
template <typename T, unsigned RegisterSize>
struct Register
{
using ValueType = T;
enum { Size = RegisterSize };
inline operator T&() { return myValue; }
inline operator const T&() const { return myValue; }
inline Register() {}
inline Register(const T & v) : myValue(v) {} // Not explicit
inline Register & operator=(const T & v)
{
myValue = v;
return *this;
}
T myValue;
};
using Register4 = Register<__m128, 4u>;
using Register8 = Register<__m256, 8u>;
// Could provide more declarations for __m128d, __m128i, etc. if needed
Using the above, you can overload on Register4
, Register8
, etc. or produce template functions taking Register
s without running into linking issues and without changing ABI settings.
精彩评论