Benchmarking an application in a fully loaded machine

Developer https://www.devze.com 2023-02-09 04:05 Source: Internet

Related topics: benchmarking c

I need to "time" or benchmark a number crunching application written in C/C++. The problem is that the machine where I run the program is usually full of people doing similar things, so the CPUs are always at full load.

I thought about using functions from time.h like "get time of the day" (I don't remember the exact syntax, sorry) and similar ones, but I am afraid they would not be suitable for this case. Am I right?

And the "time" program from bash gave me some errors a long time ago.

Another problem is that sometimes I need to get timings in the range of 0.5 seconds or so.

Does anybody have a hint?

P.S.: The compiler is gcc and in some cases nvcc (NVIDIA).

P.S.2: In my benchmarks I just want to measure the execution time between two points of the main function.

You didn't mention which compiler you are using, but with GNU's g++ I usually set the -pg flag to build with profiling information.

Each time you run the application, it will create an output file that, when parsed with the gprof application, will give you lots of information about where the program spends its time.

See this for starters.

From your other recent questions, you seem to be using MPI for parallelisation. Assuming this question is within the same context, then the simplest way to time your application would be to use MPI_Wtime().

From the man page:

This subroutine returns the current value of time as a double precision floating point number of seconds. This value represents elapsed time since some point in the past. This time in the past will not change during the life of the task. You are responsible for converting the number of seconds into other units if you prefer.

Example usage:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    int taskid;
    double t_start, t_end;

    MPI_Init(&argc,&argv);
    MPI_Comm_rank(MPI_COMM_WORLD,&taskid); 

    t_start = MPI_Wtime();

    /* .... your computation kernel .... */

    t_end = MPI_Wtime();

    /* make sure all processes have completed */
    MPI_Barrier(MPI_COMM_WORLD);

    if (taskid == 0) {
        printf("Elapsed time: %1.2f seconds\n", t_end - t_start);
    }

    MPI_Finalize();
    return 0;
}

The advantage of this is that the underlying MPI library takes care of the platform-specific details of reading the time, although you might want to use MPI_Wtick() to determine the resolution of the timer used on each platform.

It's hard to meaningfully compare timings from programs running for such a short time. Usually the solution is to run multiple times.
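A minimal sketch of the "run multiple times" idea, assuming the code between your two points in main can be wrapped in a callable. The names time_runs, fastest and median are hypothetical helpers made up for this illustration, not part of any library. The fastest run is the one least disturbed by other users, and the median gives a feel for the typical case.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `kernel` `repeats` times and returns the
// wall-clock duration of each run in seconds.
template <typename F>
std::vector<double> time_runs(F kernel, std::size_t repeats) {
  std::vector<double> secs;
  secs.reserve(repeats);
  for (std::size_t i = 0; i < repeats; ++i) {
    const auto start = std::chrono::steady_clock::now();
    kernel();
    const auto stop = std::chrono::steady_clock::now();
    secs.push_back(std::chrono::duration<double>(stop - start).count());
  }
  return secs;
}

// Smallest sample, i.e. the run that suffered least from other load.
// Returns 0 for an empty sample set.
double fastest(const std::vector<double>& secs) {
  return secs.empty() ? 0.0 : *std::min_element(secs.begin(), secs.end());
}

// Median of the samples (the upper middle element for an even count).
// Takes a copy because nth_element reorders its input.
double median(std::vector<double> secs) {
  if (secs.empty()) return 0.0;
  const auto mid = secs.begin() + secs.size() / 2;
  std::nth_element(secs.begin(), mid, secs.end());
  return *mid;
}
```

For example, something like fastest(time_runs(run_some_code, 10)) would report the best of ten runs, where run_some_code stands in for your own computation.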

The time builtin in bash (or /usr/bin/time) reports the CPU time the process actually used (the user and sys figures) alongside the elapsed real time. On a loaded machine the CPU time is more useful than the wall-clock time, but there is too much going on to compare timings at a fine grain; only large differences, on the scale of orders of magnitude, will still be apparent.
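To see the difference between processor time and wall-clock time from inside the program, one option is to sample both clocks around the same region. This is only a sketch: Timing and measure are hypothetical names invented here, and it assumes a Linux/glibc setup, where std::clock reports processor time consumed by the whole process (summed over its threads).

```cpp
#include <chrono>
#include <ctime>

// Hypothetical result type: both clocks sampled around the same region.
struct Timing {
  double cpu_secs;   // processor time consumed by this process
  double wall_secs;  // elapsed real time, including time spent waiting for a CPU
};

// Runs `kernel` once and returns its CPU time and wall-clock time in seconds.
template <typename F>
Timing measure(F kernel) {
  const std::clock_t cpu_start = std::clock();
  const auto wall_start = std::chrono::steady_clock::now();
  kernel();
  const auto wall_stop = std::chrono::steady_clock::now();
  const std::clock_t cpu_stop = std::clock();
  return {double(cpu_stop - cpu_start) / CLOCKS_PER_SEC,
          std::chrono::duration<double>(wall_stop - wall_start).count()};
}
```

On a busy machine a single-threaded kernel will typically show wall_secs noticeably larger than cpu_secs, because the process spends part of the elapsed time waiting to be scheduled.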

You can also use clock(), which measures the processor time your program has consumed, to get a rough estimate:

#include <ctime>
#include <iostream>

struct Timer {
  std::clock_t _start, _stop;
  Timer() : _start(std::clock()) {}
  void restart() { _start = std::clock(); }
  void stop() { _stop = std::clock(); }
  std::clock_t clocks() const { return _stop - _start; }
  double secs() const { return double(clocks()) / CLOCKS_PER_SEC; }
};

int main() {
  Timer t;
  //run_some_code();
  t.stop();
  std::cout << "That took " << t.secs() << " seconds.\n";
  return 0;
}