I have a recursive function that searches through all tuples, and I would like to be able to specify the depth in the tree at which parallelization with OpenMP is invoked. I am trying the following:
#include <stdio.h>
#include <stdlib.h>

void process(int *tuple);   /* defined elsewhere, not shown */
void backtrack(int, int, int, int *, int);

int main(void)
{
    int n = 4;                          /* tuple length */
    int v = 3;                          /* number of values per position */
    int depth_of_parallelization = 2;   /* tree level at which to go parallel */
    int *tuple = calloc(n, sizeof(int));
    backtrack(0, n, v, tuple, depth_of_parallelization);
    free(tuple);
    return 0;
}

void backtrack(int ell, int n, int v, int *tuple, int depth_of_parallelization)
{
    if (ell == n)
    {
        process(tuple);
    }
    else
    {
        #pragma omp parallel for if(ell == depth_of_parallelization) shared(n, ell, depth_of_parallelization, tuple)
        for (int i = 0; i < v; i++)
        {
            if (ell == depth_of_parallelization)
            {
                /* each iteration works on its own copy of the prefix */
                int *local_tuple = calloc(n, sizeof(int));
                for (int j = 0; j < ell; j++) local_tuple[j] = tuple[j];
                local_tuple[ell] = i;
                backtrack(ell + 1, n, v, local_tuple, depth_of_parallelization);
                free(local_tuple);
            }
            else
            {
                tuple[ell] = i;
                backtrack(ell + 1, n, v, tuple, depth_of_parallelization);
            }
        }
    }
}
The for loop is distributed across three (v) threads, but all remaining calls of backtrack are then processed by thread zero. I have found a discussion of using OpenMP tasks for nested parallelization of a binary-tree DFS. In that example, the left and right children are each explicitly wrapped in a #pragma omp task construct. I tried to do something similar:
#pragma omp single
for (i = 0; i < v; i++)
{
    #pragma omp task if(ell == depth_of_parallelization) final(ell == depth_of_parallelization)
    {
        /* same code in here */
    }
}
but not all of the tuples are generated, so I suspect I am doing something wrong with shared memory.
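For reference, here is a minimal sketch of how a task-based version limited to a single level might look. It is not taken from the discussion mentioned above. It assumes that the missing tuples come from tasks sharing one tuple buffer, so the prefix is copied before each task is created and the task then owns that copy. The names `backtrack_tasks` and `count_tuples` are hypothetical, and `process` here is a stand-in that only counts the tuples it sees.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static long processed = 0;   /* hypothetical counter standing in for real work */

/* Stand-in for process(): counts each completed tuple. */
static void process(const int *tuple, int n)
{
    (void)tuple; (void)n;
    #pragma omp atomic
    processed++;
}

static void backtrack_tasks(int ell, int n, int v, int *tuple, int depth)
{
    if (ell == n) { process(tuple, n); return; }

    for (int i = 0; i < v; i++) {
        if (ell == depth) {
            /* Copy the prefix BEFORE creating the task, so the task never
               reads the buffer that this loop keeps overwriting. */
            int *local = malloc((size_t)n * sizeof *local);
            assert(local != NULL);
            memcpy(local, tuple, (size_t)ell * sizeof *local);
            local[ell] = i;

            #pragma omp task firstprivate(local)
            {
                /* Below this level the recursion is plain serial code. */
                backtrack_tasks(ell + 1, n, v, local, depth);
                free(local);
            }
        } else {
            tuple[ell] = i;
            backtrack_tasks(ell + 1, n, v, tuple, depth);
        }
    }
}

/* Runs the search and returns how many tuples were processed. */
long count_tuples(int n, int v, int depth)
{
    int *tuple = calloc((size_t)n, sizeof *tuple);
    assert(tuple != NULL);
    processed = 0;

    /* One thread walks the top of the tree and creates the tasks; the
       barrier at the end of the region waits for all of them. */
    #pragma omp parallel
    #pragma omp single
    backtrack_tasks(0, n, v, tuple, depth);

    free(tuple);
    return processed;
}
```

With n = 4 and v = 3, a call such as `count_tuples(4, 3, 2)` should report 3^4 = 81 tuples whether or not the code is compiled with OpenMP enabled, which gives a quick way to check that none are lost.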
I have also seen suggestions to enable nested parallelism with omp_set_nested(1), but this had no effect.
In any case, I only want to parallelize at one level of my DFS, and I have not found anyone discussing this specific issue. Any help is appreciated.
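If the "thread zero" observation comes from printing omp_get_thread_num() inside the recursion (an assumption, since the post does not say how it was measured), it may be worth printing the ancestor thread as well. A parallel region whose if clause is false still runs as a team of one, and omp_get_thread_num() reports 0 inside it regardless of which outer thread is executing. The sketch below is a small diagnostic under that assumption; the function name `inner_ids_hide_outer_thread` is hypothetical, and the serial fallbacks exist only so that it builds without OpenMP.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_ancestor_thread_num(int level) { (void)level; return 0; }
#endif

/* Returns 1 if, inside an inactive nested region, every thread reports
   id 0 while its level-1 ancestor id matches the outer thread id. */
int inner_ids_hide_outer_thread(int nthreads)
{
    int ok = 1;
    (void)nthreads;   /* unused when compiled without OpenMP */

    #pragma omp parallel num_threads(nthreads) reduction(&&:ok)
    {
        int outer_id = omp_get_thread_num();

        /* if(0) mimics a recursive call where ell != depth_of_parallelization:
           the region is serialized and runs as a team of one thread. */
        #pragma omp parallel if(0)
        {
            int inner_id = omp_get_thread_num();
            int ancestor = omp_get_ancestor_thread_num(1);
            printf("outer %d: inner id %d, ancestor id %d\n",
                   outer_id, inner_id, ancestor);
            if (inner_id != 0 || ancestor != outer_id) ok = 0;
        }
    }
    return ok;
}
```

If the deeper calls in the real program show different ancestor ids, the subtrees are being handled by different threads even though each one reports itself as thread 0.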