Compiling Eigen library for iPhone with vectorisation

Posted on https://www.devze.com, 2023-03-10 19:02. Source: web
I am struggling to compile the Eigen library for the iPhone 4, which has an ARM processor with the armv7 instruction set. Everything works fine as long as I specify the preprocessor define EIGEN_DONT_VECTORIZE. But because of some performance issues I would like to use armv7-optimised code.

Regardless of which compiler I use, LLVM-GCC 4.2 or LLVM Clang 2.0, I always run into compilation errors. I figured out (or at least I think so) that LLVM-GCC 4.2 is the only way to get access to these ARM NEON specific instructions.

When I do not set EIGEN_DONT_VECTORIZE (and provide -mfloat-abi=softfp -mfpu=neon to gcc) I get the following gcc compiler error:

src/m3CoreLib/Eigen/src/Core/arch/NEON/PacketMath.h:89: error: expected unqualified-id before '__extension__'

I have read about issues with the "old" gcc 4.2 and the recommendation to use a newer version of gcc. I am not sure, but I believe this is not an option because of App Store approval. Is there anything else I can do to get it compiled for the iPhone? Has anybody out there solved this?

Thanks, Kay


After fiddling around with different compiler settings for hours and hours, I found a solution that satisfies me and came to the following conclusion.

There is a surprisingly large difference between debug and release settings with Eigen's template-based approach: with release settings and the usual optimisation flags enabled, the application runs 20 to 40 times faster than the debug build. I have never seen such a difference before in any language; in my experience the factor is usually 1.5 to 3.
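To check the debug/release gap on your own machine, something like the following could be compiled once without optimisation and once with the usual release flags. This is only a rough sketch: it assumes a C++11 compiler, it uses a naive matrix product on std::vector as a stand-in for Eigen code (it does not use Eigen at all), and the names multiply, time_multiply_ms and built_with_optimisation are made up for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Naive n x n matrix product on row-major storage.
// A stand-in for Eigen code, NOT Eigen itself.
std::vector<double> multiply(const std::vector<double>& a,
                             const std::vector<double>& b,
                             std::size_t n) {
    std::vector<double> c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}

// Wall-clock time of one n x n multiplication, in milliseconds.
double time_multiply_ms(std::size_t n) {
    std::vector<double> a(n * n, 1.0), b(n * n, 2.0);
    const auto start = std::chrono::steady_clock::now();
    const std::vector<double> c = multiply(a, b, n);
    const auto stop = std::chrono::steady_clock::now();
    volatile double sink = c[0];  // keep the result observable to the optimiser
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// GCC and Clang define __OPTIMIZE__ when building with -O1 or higher.
bool built_with_optimisation() {
#ifdef __OPTIMIZE__
    return true;
#else
    return false;
#endif
}
```

Calling time_multiply_ms(256) from main in both builds and comparing the two numbers gives a first impression of the factor. The exact ratio depends on compiler, flags and hardware, so it will not necessarily match the 20 to 40 reported above.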

I still cannot force vectorisation, i.e. the code compiles only with EIGEN_DONT_VECTORIZE defined, but the resulting performance fits my needs now.
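If you would rather not set EIGEN_DONT_VECTORIZE globally, one option might be to define it only for the problematic compiler, before any Eigen header is included. The following is a sketch under assumptions, not a verified fix: the TARGET_HAS_NEON macro and the vectorisation_status function are hypothetical names, and the compiler check (GCC 4.2 or older that is not Clang) is my guess at how to single out LLVM-GCC 4.2. The Eigen include is left as a comment so the snippet stands alone.

```cpp
#include <string>

// GCC-compatible compilers define __ARM_NEON__ when NEON is enabled,
// e.g. via -mfpu=neon. TARGET_HAS_NEON is a hypothetical helper macro.
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#  define TARGET_HAS_NEON 1
#else
#  define TARGET_HAS_NEON 0
#endif

// Assumed policy: switch off Eigen's explicit vectorisation only for
// GCC 4.2-era front ends (Clang also reports GCC 4.2, so exclude it).
#if TARGET_HAS_NEON && defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ == 4) && (__GNUC_MINOR__ <= 2)
#  ifndef EIGEN_DONT_VECTORIZE
#    define EIGEN_DONT_VECTORIZE
#  endif
#endif

// In a real project the Eigen headers would be included here, after the
// define above:
// #include <Eigen/Core>

// Reports which branch the preprocessor took for this build.
std::string vectorisation_status() {
#if defined(EIGEN_DONT_VECTORIZE)
    return "disabled (EIGEN_DONT_VECTORIZE defined)";
#elif TARGET_HAS_NEON
    return "NEON enabled by compiler flags";
#else
    return "no NEON flags detected";
#endif
}
```

Printing vectorisation_status() once at start-up shows which configuration a given build actually ended up with, which is handy when switching between compilers in Xcode.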
