
Explanation for D vs. C++ performance difference

开发者 (https://www.devze.com), 2023-03-01 13:46, source: web

Simple example in D:

import std.stdio, std.conv, core.memory;

class Foo{
    int x;
    this(int _x){x=_x;}
}

void main(string[] args) {
    GC.disable();
    int n = to!int(args[1]);
    Foo[] m = new Foo[n];
    for(int i=0;i<n;i++){
        m[i] = new Foo(i);
    }
}

C++ code:

#include <cstdlib>
using namespace std;
class Foo{
public:
    int x;
    Foo(int _x);

};

Foo::Foo(int _x){
    x = _x;
}

int main(int argc, char** argv) {
    int n = atoi(argv[1]);
    Foo** gx = new Foo*[n];
    for(int i=0;i<n;i++){
        gx[i] = new Foo(i);
    }
    return 0;
}

No compilation flags were used.

Compiling and running:

>dmd td.d
>time ./td 10000000
>real   0m2.544s

Analogous example in C++ (gcc), running:

>time ./tc 10000000
>real   0m0.523s

Why? Such a simple example, and such a big difference: 2.54s and 0.52s.


You're mainly measuring three differences:

  1. The difference between the code generated by gcc and dmd
  2. The extra time D takes to allocate using the GC.
  3. The extra time D takes to allocate a class.

Now, you might think that point 2 is invalid because you called GC.disable(), but that only stops the GC from running collections as it normally would. It does not make the GC disappear entirely and automatically redirect all memory allocations to C's malloc. The runtime must still do most of its usual work to ensure that the GC knows about the memory allocated, and all of that takes time. Normally, this is a relatively insignificant part of program execution (even ignoring the benefits GCs give). However, your benchmark makes allocation the entirety of the program, which exaggerates this effect.

Therefore, I suggest you consider two changes to your approach:

  1. Either switch to using gdc to compare against gcc or switch to dmc to compare to dmd
  2. Make the programs more equivalent. Either have both D and C++ allocate structs on the heap or, at the very least, make it so that D is allocating without touching the GC. If you're optimizing a program for maximum speed, you'd be using structs and C's malloc anyway, regardless of language.

I'd even recommend a 3rd change: since you're interested in maximum performance, you ought to try to come up with a better program entirely. Why not switch to structs and have them located contiguously in memory? This would make allocation (which is, essentially, the entire program) as fast as possible.
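As a rough illustration of the contiguous layout suggested above, here is a minimal C++ sketch. It assumes std::vector is acceptable in place of raw new; the helper name make_contiguous is hypothetical and does not come from the original post. One allocation holds all n structs, and the vector releases it automatically.

```cpp
#include <cstddef>
#include <vector>

struct Foo {
    int x;
};

// Hypothetical helper (not from the original post): builds n Foo values
// in a single contiguous buffer instead of n separate heap objects.
std::vector<Foo> make_contiguous(int n) {
    if (n <= 0) {
        return {}; // nothing to allocate for non-positive sizes
    }
    std::vector<Foo> v(static_cast<std::size_t>(n)); // one allocation for all elements
    for (int i = 0; i < n; ++i) {
        v[static_cast<std::size_t>(i)].x = i;
    }
    return v; // storage is freed when the returned vector goes out of scope
}
```

Because the elements sit next to each other, the loop touches memory sequentially, and there is no per-object bookkeeping in the allocator.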

Running your original code with dmd and dmc on my machine gives the following times:

  • DMC 8.42n (no flags) : ~880ms
  • DMD 2.062 (no flags) : ~1300ms

Modifying the code to the following:

C++ code:

#include <cstdlib>
struct Foo {
    int x;
};

int main(int argc, char** argv) {
    int n = atoi(argv[1]);
    Foo* gx = (Foo*) malloc(n * sizeof(Foo));
    for(int i = 0; i < n; i++) {
        gx[i].x = i;
    }
    free(gx);
    return 0;
}

D code:

import std.conv;
struct Foo{
    int x;
}

void main(string[] args) {
    int n = to!int(args[1]);
    Foo[] m = new Foo[](n);
    foreach(i, ref e; m) {
        e.x = cast(int) i; // i is size_t; the explicit cast is needed on 64-bit builds
    }
}

Running my code with DMD and DMC gives the following times:

  • DMC 8.42n (no flags) : ~95ms +- 20ms
  • DMD 2.062 (no flags) : ~95ms +- 20ms

Essentially identical (I'd need proper statistics to say which one is truly faster, but at this scale it's irrelevant). Notice that this is much, much faster than the naive approach, and D is equally capable of using this strategy. In this case, the run-time difference is negligible, yet we retain the benefits of using a GC, and there are definitely far fewer things that could go wrong in writing the D code (notice how your program failed to delete all of its allocations?).
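The remark about needing statistics can be made concrete with repeated timing runs. The following is only a sketch under stated assumptions: the names per_object, contiguous and median_ms are hypothetical, the per-object variant also deletes its objects (which the original benchmark did not), and n is assumed to be non-negative. It uses std::chrono::steady_clock and reports the middle sample of several runs rather than a single measurement.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

struct Foo {
    int x;
};

// Per-object strategy: one heap allocation per Foo, as in the original benchmark,
// but with each object deleted again. Returns a checksum so the work is observable.
long long per_object(int n) {
    std::vector<Foo*> ptrs(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) ptrs[static_cast<std::size_t>(i)] = new Foo{i};
    long long sum = 0;
    for (Foo* p : ptrs) { sum += p->x; delete p; }
    return sum;
}

// Contiguous strategy: a single allocation holding all n structs.
long long contiguous(int n) {
    std::vector<Foo> v(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) v[static_cast<std::size_t>(i)].x = i;
    long long sum = 0;
    for (const Foo& f : v) sum += f.x;
    return sum;
}

// Hypothetical helper: runs work(n) `reps` times and returns the middle
// wall-clock sample in milliseconds (the upper median for even counts).
template <typename F>
double median_ms(F work, int n, int reps) {
    if (reps < 1) reps = 1; // always take at least one sample
    std::vector<double> samples;
    volatile long long sink = 0; // keeps the result from being discarded
    for (int r = 0; r < reps; ++r) {
        auto start = std::chrono::steady_clock::now();
        sink = work(n);
        auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    (void)sink;
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

Calling median_ms(per_object, 10000000, 5) and median_ms(contiguous, 10000000, 5) from main would give figures comparable to the ones quoted here; the actual numbers depend on the compiler, flags and machine, so none are claimed.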

Furthermore, if you absolutely wanted to, D lets you use C's standard library via import core.stdc.stdlib; (std.c.stdlib in older releases). This would allow you to truly bypass the GC and achieve maximum performance by using C's malloc, if necessary. In this case, it's not necessary, so I erred on the side of safer, more readable code.


Try this one:

import std.stdio, std.conv, core.memory;

class Foo{
    int x = void;
    this(in int _x){x=_x;}
}

void main(string[] args) {
    GC.disable();
    int n = to!int(args[1]);
    Foo[] m = new Foo[n];
    foreach(i; 0..n){
        m[i] = new Foo(i);
    }
}