This question is short on detail because I do not know how to describe the problem any better. Sorry. I have a memory-intensive C program (it uses a lot of pointers). I have the source and compile it myself with gcc -O2, on Ubuntu Linux. The program calls clock() at the start and at the end to measure elapsed CPU time, and I also run it under the time command. The problem is that the same program is sometimes more than 20% faster (or slower) from one run to the next, without anything being changed.
$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:05 CEST 2016
CPU TIME = 6.46
real 0m6.475s
user 0m6.405s
sys 0m0.067s
$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:16 CEST 2016
CPU TIME = 8.03
real 0m8.051s
user 0m7.995s
sys 0m0.048s
$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:33 CEST 2016
CPU TIME = 6.48
real 0m6.490s
user 0m6.445s
sys 0m0.040s
$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:42 CEST 2016
CPU TIME = 6.45
real 0m6.469s
user 0m6.424s
sys 0m0.040s
$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:56 CEST 2016
CPU TIME = 8.04
real 0m8.058s
user 0m7.982s
sys 0m0.068s
My question is: how can these differences be explained, i.e. where is the extra 1.5 s (sometimes it gets even worse) spent? I suspect it has something to do with memory access, but how can I check this? Please note that I am on a desktop computer with an ordinary Intel Pentium, and the program is terminal-only (no graphics).
EDIT: I have installed perf; here are the results of two runs. As for my goal: I am comparing scientific algorithms, so it matters to me whether, for example, one is 10% faster than the other.
$ date; perf stat -B ./cudd-example-8queens
pon jun 20 01:28:39 CEST 2016
CPU TIME = 6.99
Performance counter stats for './cudd-example-8queens':
6998,021975 task-clock (msec) # 0,999 CPUs utilized
694 context-switches # 0,099 K/sec
0 cpu-migrations # 0,000 K/sec
30860 page-faults # 0,004 M/sec
17913569772 cycles # 2,560 GHz
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
7359861204 instructions # 0,41 insns per cycle
1227791153 branches # 175,448 M/sec
64508458 branch-misses # 5,25% of all branches
7,007481527 seconds time elapsed
$ date; perf stat -B ./cudd-example-8queens
pon jun 20 01:28:49 CEST 2016
CPU TIME = 8.32
Performance counter stats for './cudd-example-8queens':
8331,293343 task-clock (msec) # 0,998 CPUs utilized
813 context-switches # 0,098 K/sec
18 cpu-migrations # 0,002 K/sec
30863 page-faults # 0,004 M/sec
13978945354 cycles # 1,678 GHz
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
7366350507 instructions # 0,53 insns per cycle
1228922352 branches # 147,507 M/sec
64512487 branch-misses # 5,25% of all branches
8,343932138 seconds time elapsed