Java Math.tanh() performance

I have a Java program that does many calls to the Math.tanh() function. Out of curiosity I wanted to do a comparison with C++. Therefore I wrote two small programs, one in Java and one in C++, to test.

The Java code:

public class TestTanh { 

    public static void main(String[] args) {

        double t1 = -1.0;
        double t2 = 1.0;
        double step = 1e-8;

        double z = 0.0;
        for(double t=t1; t<=t2; t += step) {
            double y = Math.tanh(t);
            z += y;
        }
        System.out.println("Sum = " + z);
    }
}

and the C++ code:

#include <iostream>
#include <cmath>

using namespace std;

int main() {

    double t1 = -1.0;
    double t2 = 1.0;
    double step = 1e-8;

    double z = 0.0;
    for(double t=t1; t<=t2; t += step) {
        double y = tanh(t);
        z += y;
    }
    cout << "Sum = " << z << "\n";
}

Compiling and running the programs I got the following:

$ time java TestTanh
Sum = -0.41281032759865655

real    0m18.372s
user    0m17.961s
sys     0m0.109s

and

$ time ./test_tanh
Sum = -0.41281

real    0m4.022s
user    0m3.641s
sys     0m0.004s

Why does the Java program take about 5 times longer to execute? Could it be related to the JIT doing some compiling first? Or is the tanh implementation in Java slower than the one in C++?

It is a simple test that might have a trivial explanation but I have searched the web and not found an answer. My Java version is

$ java -version
java version "1.6.0_22"
Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-10M3261)
Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)

When using the tanh function in a larger program containing other basic arithmetic operations, the difference between Java and C++ became smaller (now a factor of about 2.3). The program still calls tanh repeatedly, but now there are also other operations in the loop. I also tried the FastMath class from Apache Commons, but it was actually slower (are any special settings needed?). The results for this program with identical parameters were:

C++

real    0m18.031s
user    0m18.007s
sys     0m0.007s

Java with lang.Math

real    0m40.739s
user    0m40.032s
sys     0m0.088s

Java with org.apache.commons.math.util.FastMath

real    0m46.717s
user    0m46.583s
sys     0m0.372s

My goal here was not to do any true benchmarking; I just wanted to see what the differences were in a practical situation when implementing the code in a straightforward way.

According to this, OpenJDK 6 (and, I guess, Sun's JDK 6) uses strict math, which sacrifices performance for correctness. That might be your problem. I am pretty sure that no decent JVM spends 18 seconds starting up. You should use a math library written with performance in mind, or change your JVM.

It may or may not come from the fact that the result in Java is quite precisely defined. The following requirements can cost time unless the CPU handles them in exactly the same way:

If the argument is zero, then the result is a zero with the same sign as the argument.

If the argument is positive infinity, then the result is +1.0.

The computed result must be within 2.5 ulps of the exact result.

In C, all you know is that something like tanh(x) comes out. There are standards, and there are standard-conforming compilers, but do you use such a compiler?
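The special cases quoted above from the Math.tanh documentation can be checked directly. The following is only a minimal sketch; the class name TanhSpecialCases and the helper isNegativeZero are made up for illustration and are not part of any library.

```java
public class TanhSpecialCases {

    // True if d is negative zero. +0.0 and -0.0 compare equal with ==,
    // so the raw bit patterns are compared instead.
    static boolean isNegativeZero(double d) {
        return Double.doubleToRawLongBits(d) == Double.doubleToRawLongBits(-0.0);
    }

    public static void main(String[] args) {
        // A zero argument should give a zero with the same sign.
        System.out.println("tanh(+0.0) = " + Math.tanh(0.0));
        System.out.println("tanh(-0.0) = " + Math.tanh(-0.0)
                + " (negative zero: " + isNegativeZero(Math.tanh(-0.0)) + ")");

        // Infinite arguments should give exactly +1.0 and -1.0.
        System.out.println("tanh(+Inf) = " + Math.tanh(Double.POSITIVE_INFINITY));
        System.out.println("tanh(-Inf) = " + Math.tanh(Double.NEGATIVE_INFINITY));
    }
}
```

Whether meeting these guarantees is what actually costs the time is not something this check can show; it only makes the documented behaviour visible.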

As said in the other answers, this is not how benchmarks should be done. For comparing Java programs only, I recommend you grab Caliper and try it.

When doing performance tests like this, you should always allow for a "warmup period". So you shouldn't start measuring before having run the calculations a couple of hundred or thousand times. This way the Java HotSpot compiler will have compiled what it considers to be frequently executed code, and the native binary will have put its most frequently used variables in processor registers.

My guess is that close to 100% of the difference in your result is due to the slow startup time of the JVM. Java takes ages to start compared to a natively compiled program.

It would be interesting to see a measurement actually done in code, after a "warmup period".
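A measurement of that kind could look roughly like the sketch below. It is not the original poster's program: the class name TanhTiming, the method names, the number of warmup rounds (20) and the step count (1000000, far fewer than in the question) are all arbitrary choices made for illustration.

```java
public class TanhTiming {

    // Sums Math.tanh over [from, to] using `steps` equal intervals.
    // An integer counter is used so the number of iterations is exact.
    static double sumTanh(double from, double to, int steps) {
        double h = (to - from) / steps;
        double sum = 0.0;
        for (int i = 0; i <= steps; i++) {
            sum += Math.tanh(from + i * h);
        }
        return sum;
    }

    // Runs `warmupRounds` untimed passes so HotSpot has a chance to compile
    // sumTanh, then times a single pass and returns the elapsed nanoseconds.
    static long timeAfterWarmup(int warmupRounds, int steps) {
        double sink = 0.0;
        for (int r = 0; r < warmupRounds; r++) {
            sink += sumTanh(-1.0, 1.0, steps);
        }
        long start = System.nanoTime();
        sink += sumTanh(-1.0, 1.0, steps);
        long elapsed = System.nanoTime() - start;
        // Touch the accumulated result so the loops cannot be optimized away.
        if (Double.isNaN(sink)) {
            System.out.println("unexpected NaN");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long nanos = timeAfterWarmup(20, 1000000);
        System.out.println("Timed pass after warmup: " + (nanos / 1e6) + " ms");
    }
}
```

Because the timer starts only after the JVM is running and the loop has been executed several times, JVM startup and initial compilation are excluded from the number printed. A single timed pass is still noisy, so a harness such as Caliper would give more trustworthy figures.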

For most C++ compilers, tanh is an intrinsic function (a built-in function), meaning there is no function call overhead because the compiler adds specific assembler instructions instead of a library call.

Some Java VMs also support intrinsic functions; for example, String.length() seems to be intrinsic in the Sun JVM. For Java, that means the HotSpot compiler replaces the function call with special assembler instructions (at runtime). This is a bit different from C and C++, where the compiler does that (before running the program).

However, in Java, Math.tanh doesn't seem to be intrinsic. Therefore, it's slower.

The main method is called only once, so the JVM might not compile it to native code. First read "How do I write a correct micro-benchmark in Java?"

If there is still a big difference after the micro benchmark is well written, then a possible reason is JNI overhead. The Math.tanh() method and others have been implemented in the JVM as native code. The documentation for java.lang.StrictMath says that they use the fdlibm library, which is written in C, so you might do well to use that library in your tests, so that you would not be comparing two different C libraries.
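On the Java side, one rough way to look at this is to time Math.tanh against StrictMath.tanh (the fdlibm-based version) in the same program. The sketch below assumes nothing beyond the standard library; the class name MathVsStrictMath, the method names, the round count and the step count are illustrative choices only.

```java
public class MathVsStrictMath {

    // Sums Math.tanh over [-1, 1] in `steps` equal intervals.
    static double sumMath(int steps) {
        double h = 2.0 / steps;
        double sum = 0.0;
        for (int i = 0; i <= steps; i++) {
            sum += Math.tanh(-1.0 + i * h);
        }
        return sum;
    }

    // Same loop, but calling the fdlibm-based StrictMath.tanh.
    static double sumStrict(int steps) {
        double h = 2.0 / steps;
        double sum = 0.0;
        for (int i = 0; i <= steps; i++) {
            sum += StrictMath.tanh(-1.0 + i * h);
        }
        return sum;
    }

    public static void main(String[] args) {
        int steps = 1000000;
        double sink = 0.0;
        // Untimed warmup rounds for both variants.
        for (int r = 0; r < 20; r++) {
            sink += sumMath(steps) + sumStrict(steps);
        }
        long t0 = System.nanoTime();
        double a = sumMath(steps);
        long t1 = System.nanoTime();
        double b = sumStrict(steps);
        long t2 = System.nanoTime();
        System.out.println("Math.tanh:       " + (t1 - t0) / 1e6 + " ms, sum = " + a);
        System.out.println("StrictMath.tanh: " + (t2 - t1) / 1e6 + " ms, sum = " + b);
        // Print the warmup result so those loops are not discarded.
        System.out.println("(warmup checksum " + sink + ")");
    }
}
```

If the two timings come out close, that would hint that Math.tanh takes the same path as StrictMath.tanh on that JVM; it is only a hint, and it says nothing about how fdlibm compares with the C++ library used in the question.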