Searching on the GPU for all possible paths between two nodes in a graph

In my work, the algorithm of Migliore, Martorana, and Sciortino is widely used to find all simple paths, i.e. those in which no node occurs more than once, as described in: An algorithm to find all paths between two nodes in a graph. (Although this algorithm is essentially a depth-first search and intuitively recursive in nature, the authors also present a stack-based non-recursive implementation.) I would like to know whether such an algorithm can be implemented on a GPU. At the moment I am struggling to see any real parallelism in this problem. For example, the cost of managing and scheduling threads may make it infeasible to have hardware threads search the graph cooperatively. Alternatively, a divide-and-conquer strategy might work if the graph were partitioned and the parts assigned to individual hardware threads for searching. However, one would then have to figure out how to (1) partition the graph, (2) generate the subtasks, and (3) combine the partial search results.
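For reference, the enumeration described above can indeed be done without recursion using an explicit stack. The sketch below is not the authors' published implementation, just a minimal iterative depth-first enumeration of all simple paths; all names and the adjacency-array representation are my own assumptions:

```java
import java.util.*;

public class AllSimplePaths {
    // Enumerate all simple paths from start to end; adj is an adjacency list,
    // nodes are 0..n-1. Uses an explicit stack instead of recursion.
    static List<List<Integer>> findAllPaths(int[][] adj, int start, int end) {
        List<List<Integer>> result = new ArrayList<>();
        Deque<Integer> path = new ArrayDeque<>();            // current partial path
        Deque<Iterator<Integer>> iters = new ArrayDeque<>(); // per-level neighbour iterators
        boolean[] onPath = new boolean[adj.length];

        path.addLast(start);
        onPath[start] = true;
        iters.addLast(Arrays.stream(adj[start]).iterator());

        while (!path.isEmpty()) {
            Iterator<Integer> it = iters.peekLast();
            if (it.hasNext()) {
                int next = it.next();
                if (onPath[next]) continue;        // keep the path simple
                path.addLast(next);
                onPath[next] = true;
                if (next == end) {                 // complete path: record and backtrack
                    result.add(new ArrayList<>(path));
                    path.removeLast();
                    onPath[next] = false;
                } else {
                    iters.addLast(Arrays.stream(adj[next]).iterator());
                }
            } else {                               // neighbours exhausted: backtrack
                iters.removeLast();
                onPath[path.removeLast()] = false;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Undirected square with a diagonal: 0-1, 0-2, 1-2, 1-3, 2-3
        int[][] adj = { {1, 2}, {0, 2, 3}, {0, 1, 3}, {1, 2} };
        System.out.println(findAllPaths(adj, 0, 3).size()); // prints 4
    }
}
```

The per-level iterator stack is exactly the state a recursive DFS would keep in its call frames, which is why the recursion can be eliminated mechanically.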

2 answers

I am a bit rusty on this. What about something Dijkstra-like?

    Boolean[] visited;               // visited[node] = true
    Boolean[][] connected;           // connected[node][i] = node
    Vector<Vector<Integer>>[] path;  // this is going to be huge
    Integer startNode;
    Integer endNode;
    Queue queue0;                    // for thread 0
    Queue queue1;                    // for thread 1

    while (queue0.hasNext()) {
        Integer node = queue0.getNext();
        if (visited[node]) {
            continue;
        } else {
            visited[node] = true;
        }
        for (nextNode : connected[node]) {
            for (i) {
                path[nextNode].append(path[node][i].clone().append(node));
            }
            if (nextNode % 2 == 0) { queue0.add(nextNode); }
            if (nextNode % 2 == 1) { queue1.add(nextNode); }
        }
    }

    path[endNode][i]   // the i-th path from startNode to endNode

splitting: by node % 2 (which queue a node goes into)
subtasks: finding where to go next from a node
combining: you have shared memory, right?
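Cleaned up into a runnable single-threaded sketch of the same idea (hypothetical names; the two queues are merged into one). Note that the visited flag expands each node only once, so this enumerates every path only on a DAG processed in level order, not on a general graph:

```java
import java.util.*;

public class PathPropagation {
    // BFS-style propagation of full path lists, following the answer's sketch.
    static List<List<Integer>> paths(int[][] adj, int start, int end) {
        int n = adj.length;
        List<List<List<Integer>>> path = new ArrayList<>(); // path.get(v) = all paths to v
        for (int i = 0; i < n; i++) path.add(new ArrayList<>());
        path.get(start).add(new ArrayList<>(List.of(start)));

        boolean[] visited = new boolean[n];
        Deque<Integer> queue = new ArrayDeque<>();          // queue0/queue1 merged
        queue.add(start);
        while (!queue.isEmpty()) {
            int node = queue.poll();
            if (visited[node]) continue;
            visited[node] = true;
            for (int next : adj[node]) {
                for (List<Integer> p : path.get(node)) {    // extend every known path
                    List<Integer> q = new ArrayList<>(p);
                    q.add(next);
                    path.get(next).add(q);
                }
                queue.add(next);            // the answer splits here by next % 2
            }
        }
        return path.get(end);               // i-th entry = i-th path from start to end
    }

    public static void main(String[] args) {
        // DAG: 0->1, 0->2, 1->3, 2->3
        int[][] adj = { {1, 2}, {3}, {3}, {} };
        System.out.println(paths(adj, 0, 3)); // prints [[0, 1, 3], [0, 2, 3]]
    }
}
```

The splitting by node % 2 would then only decide which worker dequeues a node; the path lists stay in memory shared by both workers.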


I do not think that your problem can easily be ported to the GPU in a way that makes it run faster. Programs that get the most out of a GPU:

  • Consist of thousands of threads, but their number is fixed. You cannot spawn new threads or kill existing ones on the fly.
  • Prefer coalesced memory access. If neighbouring threads access completely different memory areas (as graph algorithms typically do), the code will be slow.
  • Avoid recursion and stacks. The new NVIDIA Fermi cards do support function calls, and threads can have a stack, but because of the sheer number of threads the stacks are very short (or they consume a lot of memory).
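To illustrate the last point: a per-thread search on a GPU has to keep its entire DFS state in a small fixed-size array rather than a call stack. A hypothetical sketch of such a bounded search (plain Java standing in for kernel code; MAX_DEPTH is an assumed per-thread stack budget, and the bitmask limits this toy version to 64 nodes):

```java
public class BoundedDfs {
    static final int MAX_DEPTH = 8;   // assumed per-thread stack budget

    // Count all simple paths from start to end, with all DFS state held
    // in fixed-size arrays: no recursion, no dynamic allocation.
    static int countPaths(int[][] adj, int start, int end) {
        int[] nodeStack = new int[MAX_DEPTH];   // node at each stack level
        int[] edgeStack = new int[MAX_DEPTH];   // next neighbour index to try
        long onPath = 1L << start;              // bitmask of nodes on the current path
        int top = 0, found = 0;
        nodeStack[0] = start;
        edgeStack[0] = 0;
        while (top >= 0) {
            int node = nodeStack[top];
            if (node == end) {                  // complete path: count and backtrack
                found++;
                onPath &= ~(1L << node);
                top--;
                continue;
            }
            if (edgeStack[top] < adj[node].length && top + 1 < MAX_DEPTH) {
                int next = adj[node][edgeStack[top]++];
                if ((onPath & (1L << next)) == 0) {  // keep the path simple
                    top++;
                    nodeStack[top] = next;
                    edgeStack[top] = 0;
                    onPath |= 1L << next;
                }
            } else {                            // exhausted or depth limit: backtrack
                onPath &= ~(1L << node);
                top--;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Undirected square with a diagonal: 0-1, 0-2, 1-2, 1-3, 2-3
        int[][] adj = { {1, 2}, {0, 2, 3}, {0, 1, 3}, {1, 2} };
        System.out.println(countPaths(adj, 0, 3)); // prints 4
    }
}
```

The depth cap is exactly the problem the answer describes: any simple path longer than MAX_DEPTH is silently missed, so a correct GPU port would need per-thread stacks as deep as the longest path.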

I am not saying that no efficient GPU algorithm exists, but I believe there is no easy way to convert the existing algorithms into efficient GPU code.


Source: https://habr.com/ru/post/1334311/

