Tuesday, 21 June 2016

Why is the same C program sometimes much faster?

This question is short on details because I do not know how to describe the problem any better, sorry. I have a memory-intensive C program (it uses a lot of pointers). I have the source and compile it myself with gcc -O2, on Ubuntu Linux. The program calls clock() at its start and end to measure the CPU time it used, and I also run it under the time command. The problem is that the same program is sometimes more than 20% faster (or slower) with nothing changed between runs.
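
For readers who want to reproduce the measurement, the timing setup described above can be sketched roughly as follows. This is only an illustration, not the actual program: `busy_work` is a hypothetical stand-in for the real computation, and the extra wall-clock reading via `clock_gettime(CLOCK_MONOTONIC, ...)` is an addition that assumes a POSIX system. Note that `clock()` reports processor time, which is why the program's output is labelled CPU TIME.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* CPU time consumed by this process so far, in seconds (what clock() reports). */
static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Wall-clock time from a monotonic clock, in seconds (assumes POSIX). */
static double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical stand-in for the real computation: sums 0 .. n-1. */
static unsigned long busy_work(unsigned long n)
{
    volatile unsigned long acc = 0;
    for (unsigned long i = 0; i < n; i++)
        acc += i;
    return acc;
}

/* Runs the workload once and prints the CPU time and the wall-clock time. */
static void timed_run(unsigned long n)
{
    double cpu0 = cpu_seconds();
    double wall0 = wall_seconds();

    unsigned long result = busy_work(n);

    double cpu1 = cpu_seconds();
    double wall1 = wall_seconds();

    printf("result = %lu\n", result);
    printf("CPU TIME = %.2f\n", cpu1 - cpu0);
    printf("WALL TIME = %.2f\n", wall1 - wall0);
}
```

Calling `timed_run` from `main` with a large enough `n` prints both readings. If the two stay close to each other, as the CPU TIME and real values do in the runs below, the process was not waiting on anything else for a significant amount of time.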

$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:05 CEST 2016
CPU TIME = 6.46
real    0m6.475s
user    0m6.405s
sys     0m0.067s

$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:16 CEST 2016
CPU TIME = 8.03
real    0m8.051s
user    0m7.995s
sys     0m0.048s

$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:33 CEST 2016
CPU TIME = 6.48
real    0m6.490s
user    0m6.445s
sys     0m0.040s

$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:42 CEST 2016
CPU TIME = 6.45
real    0m6.469s
user    0m6.424s
sys     0m0.040s

$ date; time ./cudd-example-8queens 9
pon jun 20 00:49:56 CEST 2016
CPU TIME = 8.04
real    0m8.058s
user    0m7.982s
sys     0m0.068s

My question is: how can these differences be explained, i.e. where is the extra 1.5 s (sometimes even more) spent? It must have something to do with memory access, but how can I check this? Please note that I am on a desktop computer with an ordinary Intel Pentium, and the program is terminal-only (no graphics).

EDIT: I have installed perf, and here are two results. As for my goal: I am comparing scientific algorithms, and it matters to me whether, e.g., one is 10% faster than the other.

$ date; perf stat -B ./cudd-example-8queens 
pon jun 20 01:28:39 CEST 2016
CPU TIME = 6.99

 Performance counter stats for './cudd-example-8queens':

       6998,021975 task-clock (msec)         #    0,999 CPUs utilized          
               694 context-switches          #    0,099 K/sec                  
                 0 cpu-migrations            #    0,000 K/sec                  
             30860 page-faults               #    0,004 M/sec                  
       17913569772 cycles                    #    2,560 GHz                    
   <not supported> stalled-cycles-frontend 
   <not supported> stalled-cycles-backend  
        7359861204 instructions              #    0,41  insns per cycle        
        1227791153 branches                  #  175,448 M/sec                  
          64508458 branch-misses             #    5,25% of all branches        

       7,007481527 seconds time elapsed

$ date; perf stat -B ./cudd-example-8queens 
pon jun 20 01:28:49 CEST 2016
CPU TIME = 8.32

 Performance counter stats for './cudd-example-8queens':

       8331,293343 task-clock (msec)         #    0,998 CPUs utilized          
               813 context-switches          #    0,098 K/sec                  
                18 cpu-migrations            #    0,002 K/sec                  
             30863 page-faults               #    0,004 M/sec                  
       13978945354 cycles                    #    1,678 GHz                    
   <not supported> stalled-cycles-frontend 
   <not supported> stalled-cycles-backend  
        7366350507 instructions              #    0,53  insns per cycle        
        1228922352 branches                  #  147,507 M/sec                  
          64512487 branch-misses             #    5,25% of all branches        

       8,343932138 seconds time elapsed
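
The derived columns in the perf output (GHz and insns per cycle) can be recomputed from the raw counters, which makes it easier to compare the two runs side by side. The following is a minimal sketch; the function names are made up for illustration, and it assumes that the GHz figure is simply cycles divided by task-clock time, which matches the numbers printed above.

```c
#include <stdio.h>

/* Average clock rate in GHz: cycles divided by CPU time in nanoseconds.
   task_clock_msec is the "task-clock (msec)" value from perf stat. */
static double effective_ghz(double cycles, double task_clock_msec)
{
    return cycles / (task_clock_msec * 1e6);
}

/* Instructions retired per cycle. */
static double insns_per_cycle(double instructions, double cycles)
{
    return instructions / cycles;
}

/* Prints the derived figures for one run (hypothetical helper). */
static void print_derived(const char *label, double task_clock_msec,
                          double cycles, double instructions)
{
    printf("%s: %.3f GHz, %.2f insns per cycle\n", label,
           effective_ghz(cycles, task_clock_msec),
           insns_per_cycle(instructions, cycles));
}
```

With the counters of the first run (6998.021975 msec, 17913569772 cycles, 7359861204 instructions) this gives about 2.560 GHz and 0.41 insns per cycle; with those of the second run (8331.293343 msec, 13978945354 cycles, 7366350507 instructions) it gives about 1.678 GHz and 0.53 insns per cycle, in agreement with the perf columns. The instruction counts of the two runs are nearly identical, while the cycle counts and the derived clock rates differ.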
