Good algorithm for finding the diameter of a (sparse) graph?

Question

Good algorithm for finding the diameter of a (sparse) graph?

I have a large, connected, sparse graph in the form of an adjacency list. I would like to find two peaks that are located as far as possible from each other, that is, the diameter of the graph and two peaks that achieve this.

I am interested in this problem both in non-oriented and in directed cases for different applications. In the directional case, of course, I care about the directional distance (the shortest directional path from one vertex to another).

Is there a better approach than computing the shortest paths of all pairs?

Edit: “as much as possible”, of course, I mean “the longest shortest path”, i.e. maximum over all pairs of vertices of the shortest distance from one to another.

+51

math algorithm graph-theory

A. Rex Jul 27 '09 at 20:54

source share

13 answers

agorenst · Answer 1 · 2009-07-27 23:53

Well, I thought a little about the problem and searched a bit in googling, and I'm really sorry, but I can’t find an algorithm that doesn’t seem to “just find all pairs the shortest path”.

However, if you assume that Floyd-Warshall is the only algorithm for computing such a thing (Big-Theta of | V | ^ 3), then I have some good news for you: Johnson's algorithm for sparse graphs (thanks, trusty CLRS! ) computes all pairs of shortest paths in (Big-Oh (| V | ^ 2 * logV + VE)), which should be asymptotically faster for sparse graphs.

Wikipedia says it works for directional (not sure about non-directional, but at least I can't think of a reason why not), here's the link.

Is there anything else about the schedule that might be useful? If it can be easily correlated with a 2D plane (therefore, its planar and weight coefficients obey the triangle inequality [more strict requirements may need to be fulfilled, I'm not sure]), you can pull out some geometric algorithms (the convex hull can work in nlogn, and find the farthest pair of points easily from there).

Hope this helps! - Agor

Edit: I hope the link works now. If not, just go to Google. :)

job · Answer 2 · 2009-07-28 15:49

I do not know a better method for calculating the diameter than all the shortest paths, but Mathematica uses the following approximation for PseudoDiameter:

A geodesic graph is the shortest path between two vertices of a graph. The diameter of the graph is the longest possible length of the entire graph of the geodetic graph. PseudoDiameter finds the approximate diameter of the graph. It works starting from the vertex u and finds the vertex v which is farthest from u. This process is repeated by processing v as a new initial vertex and ends when the distance to the graph no longer increases. The vertex from the last level that has the least degree is selected as the final initial vertex u, and the bypass to see if the distance of the graph can be increased. This plot distance is considered pseudo-diameter.

http://reference.wolfram.com/mathematica/GraphUtilities/ref/PseudoDiameter.html

Aaron McDaid · Answer 3 · 2009-07-27 22:29

Change I reset again, just to continue commenting. I have some comments on Johnson's algorithm below this answer. - Aaron

My original comment: I am also interested in this issue, but I have no answer. This seems to be related to the Minimum Spanning Tree , a subgraph connecting all the vertices but having the smallest (or lowest) edges. This is an old problem with a number of algorithms; some of which seem pretty easy to implement.

I initially hoped that the diameter would be apparent after detecting MST, but now I'm losing hope :-( Perhaps MST can be used to place a reasonable upper bound on the diameter, which you can use to speed up the search for the actual diameter?

Keith Randall · Answer 4 · 2010-01-04 18:42

Here are some thoughts on how best of all couples to find the shortest paths on an undirected chart, although I'm not sure how much better it would be.

There is a subroutine that will find two nodes at a distance D from each other, if any. Select an arbitrary node x and calculate M [x] = the maximum distance from x to any other node (using any algorithm with the shortest single source). If M [x]> = D, then x is one of our nodes, and the other is easy to find. However, if M [x] D, then neither the endpoint we are looking for can be less than the distance D - M [x] from x (because there are paths from this node to all other nodes through x, distances <D). Therefore, find all nodes of a distance shorter than DM [x] from x and mark them as bad. Select the new x, this time making sure that we remove all nodes marked as bad and repeat. We hope that we will mark many nodes as badly, so we will have to do much less than | V | shortest paths.

Now we just need to set D = diam (G) and perform the procedure described above. We don’t know what diam (G) is, but we can get a rather tight range for it, as for any x, M [x] <= diam (G) <= 2M [x]. Select a few x to start, calculate M [x] for each, and calculate the upper and lower bounds on diam (G). Then we can perform a binary search in the result range using the procedure described above to find the path to the estimated length, if any.

Of course, this is disoriented only. I think you could make a similar diagram with oriented graphs. Bad nodes are those that can reach x less than DM [x], and the upper bound on diam (G) does not work, so you need a wider binary search range.

GalacticJello · Answer 5 · 2010-01-13 17:12

Not sure if it matches the bill, but interesting:

HADI: quick diameter determination and mining in massive graphs with Hadoop

U. Kan, K. Tsurakakis, A. P. Appel, K. Falutos, J. Leskoviec, “HADI: Estimation of Fast Diameters and Production in Massive Graphs with Hadoop”, CMU ML Tech Report CMU-ML-08-117, 2008 .

peter · Answer 6 · 2010-12-05 06:53

if the graph is a tree (and not oriented). you can just run 2 dfs. Start with arbitrary node u and dfs to find the farthest node v. Then start with v and find the farthest length. This length is optimal.

MAK · Answer 7 · 2010-01-05 14:31

I really doubt that there is any method of finding the longest shortest path without using any shortest path algorithm for all pairs (finding the shortest path with one source several times basically does all pairs in the worst case).

The "diameter" becomes difficult to determine in terms of the "longest path" if the graph is not a tree or DAG. The "longest" path can be infinite if there are cycles on the chart. Therefore, a simple graph traversal cannot give the longest path to all nodes. Since you have already stated that your graph is not necessarily acyclic, and you are interested in the “longest shortest” path, there seems to be no method that can avoid finding the shortest path for all nodes. Using Johnson's algorithm, Agor suggested, is probably best suited for this.

Of course, you can use a heuristic based approach. The most commonly used approach is an algorithm that uses a pseudo-peripheral vertex .

lorenzog · Answer 8 · 2010-01-06 19:22

Forgive me if my answer is incorrect in terms of syntax, but my Algorithm course was a long time ago (and not in English).

If I understand your problem correctly, you want to find out what maximum number you can count, starting with node A and reaching node B without "re-checking" your steps. If so, I would suggest that your schedule is acyclic (a cyclic version will appear later).

First of all, the upper limit is the number of edges. As I see it: take one node, create a tree where the node is in the root, and each subsequent node that you can reach is at the next level. The height of the tree you are building is the diameter, and the leaves are the nodes that are at that distance. If this distance = number of edges, you are done. If not, select another node and repeat.

I think this is like building a breadth-first search. Without knowing more about the graph, you could use some heuristics to see which tree orientation (i.e. which node should be selected first) would be better, but this is another topic.

Regarding the cyclical nature of the chart, as others have indicated, this can lead to endless cycles. A way to get rid of them may be to “exclude” the nodes that belong to the cycle, and add the longest path between them as the value that you get by entering the cycle and exiting it, touch each node only once.

Now, as I said, this method can easily be the same as for the whole pair shortest path. Worst of all, the complexity of the case, of course, is the same and cannot be otherwise.

Justin W · Answer 9 · 2010-01-08 18:11

One way to get an estimate of this number is to start from an arbitrary point and execute the grass-fire algorithm with a width in width, indicating the shortest distance to each node. The longest distance here is your rating.

Executing this extremely fast algorithm several times with different starting points, and then with the maximum will increase the accuracy of the estimate and, of course, give you a decent lower bound. Depending on the distribution and connectivity of your schedule, this estimate may even be accurate!

If your graph is large enough, an asymptotic analysis of how the score changes by adding more samples may allow you to predict an even better guess.

If you are interested in the exact answer, it seems unlikely that you can avoid too sharp angles if your schedule is not easily broken down into components that are loosely coupled to each other - in which case you can limit yours, find the shortest path between all pairs of vertices in different components.

Karoly Horvath · Answer 10 · 2011-03-08 19:40

Select vertex v and do BFS (v), this will calculate the distance from v for all vertices. Get the longest distance. This is O (V + E)

Now run this algorithm for all v vertices and select the maximum of these longest distances. Total difficulty: O (V * (V + E))

JeffWilkinson · Answer 11 · 2019-04-30 21:59

You may not have to calculate ALL pairs, because the undirected graph has an upper bound available and it can move down.

When Breath-First-Search (BFS) is performed from an arbitrary node, it can list all other nodes sorted by distance. Of course, the longest distance is the lower limit of the diameter and the candidate for it.

The two upper distances stacked together are the upper limit of the diameter you are looking for. By accepting the top two, you can exclude any nodes for which you have already made BFS, since you already know the diameters that use these nodes as the endpoint. When assigning longer distance nodes to the following nodes performing BFS, the upper bound will eventually reach the lower bound. This can happen before you have made all the pairs.

Yin Zhu · Answer 12

Dirty method:

We know that for the graph G (V, E) with | V | = n and | E | = m Dijkstra's algorithm works in O(m+nlogn) , and this is for one source. For your pairing problem, you need to run Dijkstra for each node as a starting point.

However, if you have many machines, you can easily perform a parallel process.

This method is easiest to implement, definitely not very good.

zezba9000 · Answer 13 · 2010-12-05 10:21

Yes, there is a better way to find the diameter of the graph. Here I made a simple class to demonstrate this. The vertices will be points on your chart.

 public class MyTestClass { //Simple Point struct struct Vertex { public float X, Y; public Vertex(float pX, float pY) { X = pX; Y = pY; } } //For getting the bounds of your graph struct BoundingBox { public float Left, Right, Bottom, Top; public BoundingBox(float pLeft, float pRight, float pBottom, float pTop) { Left = pLeft; Right = pRight; Bottom = pBottom; Top = pTop; } } //Properties Vertex[] vertices; BoundingBox bound; float diameter; //Constructor //Here is the fastest way to get the diameter >> public MyTestClass() { //Init objects vertices = new Vertex[100]; for(int i = 0; i != vertices.Length; ++i) vertices[i] = new Vertex(i, i); bound = new BoundingBox(vertices[0].X, vertices[0].X, vertices[0].Y, vertices[0].Y); //Calculate BoundingBox for(int i = 0; i != vertices.Length; ++i) { bound.Left = (vertices[i].X <= bound.Left) ? vertices[i].X:bound.Left; bound.Right = (vertices[i].X >= bound.Right) ? vertices[i].X:bound.Right; bound.Bottom = (vertices[i].Y <= bound.Bottom) ? vertices[i].Y:bound.Bottom;//NOTE: If Y is faces down, then flip bottom & top comparison bound.Top = (vertices[i].Y >= bound.Top) ? vertices[i].Y:bound.Top; } //Messure Size of the BoundingBox float vecX = (bound.Right-bound.Left); float vecY = (bound.Top-bound.Bottom); diameter = (float)System.Math.Sqrt((vecX*vecX) + (vecY*vecY)); } }

Good algorithm for finding the diameter of a (sparse) graph?

More articles: