How to speed up the distributed graph algorithm?

I have a problem with the graph as follows, and I need to optimize its execution performance. I'm just looking for a programming technique, not algorithmic optimization to improve performance. The task is as follows: for a graph G (V, E), each node u builds a subset of its neighbors N (u), called a multiset relay (M_r (u)), such that each neighbor of 2 hop u is equal to the neighboring at least one node in M_r (u). The construction of M_r (u) with node u is as follows.

construct_mr(u) 1) M_r(u) is initially empty. 1') The set of non-covered 2-hop of neighbors of u is the set of all 2-hop neighbors of u. // a covered 2-hop neighbor of u: is a 2-hop neighbor of u that is also a neighbor to at least one of the nodes of M_r(u). 2) while (M _r(u) is not a multiset relay set) 2a) update the set of non-covered 2-hop neighbors of u. 2b) add to M_r(u) a neighbor v that cover the most non-covered 2-hop neighbors of u. 

Now I have done the following:

 for each node u \in V: construct_mr(u). 

The problem here is that it is a serialized implementation and has terrible runtimes when the graph is large and dense. I am looking for a method that speeds up the execution of such an algorithm - it is advisable to use java or C ++. I, although multithreading, do I have to play with thread scheduling to get good performance? [Note that message passing programming models will not have any effect since we are not transmitting any message]

+4
source share
1 answer

I doubt that a distributed implementation will help.

The problem is based on the graph traversal process in which you identify each of the 2-way neighbors. To do this, each instance in a distributed version of the algorithm needs a copy of the graph:

  • In the latter case, each instance needs the entire graph. You will probably find that the time taken to distribute the graph to instances exceeds any acceleration of the purely parallel phase.

  • An alternative is to split the graph and send each instance of a portion of the graph. But then you presented the problem that the instance must talk to another instance to find out about the nodes of the graph that are not part of its graph, and this will probably lead to a denial of parallelism.

(Generally, distributed solutions work best if you don't need to move a lot of data between instances, and if calculations can be pretty much done by instances without talking to others. The cost of communication and synchronization between instances tends to kill parallelism.)

Not. If you want to parallelize this, it is best to use multithreading in one JVM. To get the maximum performance for this implementation of the algorithm, set the number of threads to the same as the number of available processors. Creating more threads than this will make the situation worse, not better. (And not bothering with thread schedules ... this will not help.)


Having said that, I suspect that the real reason for slow computation lies in the details of the algorithm and how data structures are implemented.

There is no magic "programming technique" that can quickly make a slow algorithm ... without change.

(One approach that works is to profile your code, determine where most of the time is spent, and ... after deep thinking about it ... redesign the code / data structure to improve it. no magic. Just hard work and thinking.)

+1
source

Source: https://habr.com/ru/post/1399594/


All Articles