I have to work on a fortran program, which used to be compiled using Microsoft Compaq Visual Fortran 6.6. I would prefer to work with gfortran but I have met lots of problems. The main problem is that the generated binaries have different behaviours. My program takes an input file and then has to generate an output file. But sometimes, when using the binary compiled by gfortran, it crashes before its end, or gives different numerical results. This a program written by researchers which uses a lot of float numbers.
So my question is: what are the differences between these two compilers which could lead to this kind of problem?
edit: My program computes the values of some parameters and there are numerous iterations. At the beginning, everything goes well. After several iterations, some NaN values appear (only when compiled by gfortran).
edit: Think you everybody for your answers. S开发者_运维知识库o I used the intel compiler which helped me by giving some useful error messages. The origin of my problems is that some variables are not initialized properly. It looks like when compiling with compaq visual fortran these variables take automatically 0 as a value, whereas with gfortran (and intel) it takes random values, which explain some numerical differences which add up at the following iterations. So now the solution is a better understanding of the program to correct these missing initializations.
There can be several reasons for such behaviour. What I would do is:
Switch off any optimization
Switch on all debug options. If you have access to e.g. intel compiler, use
ifort -CB -CU -debug -traceback
. If you have to stick to gfortran, usevalgrind
, its output is somewhat less human-readable, but it's often better than nothing.Make sure there are no implicit typed variables, use
implicit none
in all the modules and all the code blocks.Use consistent float types. I personally always use
real*8
as the only float type in my codes. If you are using external libraries, you might need to change call signatures for some routines (e.g., BLAS has different routine names for single and double precision variables).
If you are lucky, it's just some variable doesn't get initialized properly, and you'll catch it by one of these techniques. Otherwise, as M.S.B. was suggesting, a deeper understanding of what the program really does is necessary. And, yes, it might be needed to just check the algorithm manually starting from the point where you say 'some NaNs values appear'.
Different compilers can emit different instructions for the same source code. If a numerical calculation is on the boundary of working, one set of instructions might work, and another not. Most compilers have options to use more conservative floating point arithmetic, versus optimizations for speed -- I suggest checking the compiler options that you are using for the available options. More fundamentally this problem -- particularly that the compilers agree for several iterations but then diverge -- may be a sign that the numerical approach of the program is borderline. A simplistic solution is to increase the precision of the calculations, e.g., from single to double. Perhaps also tweak parameters, such as a step size or similar parameter. Better would be to gain a deeper understanding of the algorithm and possibly make a more fundamental change.
I don't know about the crash but some differences in the results of numerical code in an Intel machine can be due to one compiler using 80-doubles and the other 64-bit doubles, even if not for variables but perhaps for temporary values. Moreover, floating-point computation is sensitive to the order elementary operations are performed. Different compilers may generate different sequence of operations.
Differences in different type implementations, differences in various non-Standard vendor extensions, could be a lot of things.
Here are just some of the language features that differ (look at gfortran and intel). Programs written to fortran standard work on every compiler the same, but a lot of people don't know what are the standard language features, and what are the language extensions, and so use them ... when compiled with a different compiler troubles arise.
If you post the code somewhere I could take a quick look at it; otherwise, like this, 'tis hard to say for certain.
精彩评论