Select a pair of lines that obey the rule.

I have a large table (1M rows) with the following columns: source, dest, distance. Each line defines a link (A to B).

I need to find the distances between a pair using anoter node. Example: If you want to find the distance between A and B, If I find node x and have: x → A x → B I can add these distances and the distance between A and B. My question is: How to find all nodes (for example, x) and get their distances to (A and B)? My goal is to choose the minimum distance value.

Ps: A and B are just one connection (I need to do this for 100K connections). Thanks!

+6
source share
6 answers

As Andomar said, you will need Dijkstra's algorithm, here is a link to this algorithm in T-SQL: T-SQL Dijkstra Algorithm

+1
source

Assuming that you want to get the path from AB with many intermediate steps, this cannot be done in plain SQL for an undetermined number of steps. Simply put, he lacks expressive power, see http://en.wikipedia.org/wiki/Expressive_power#Expressive_power_in_database_theory . As Andomar said, load the data into the process and the Jikstra algorithm.

0
source

It sounds like a salesman problem .

In terms of SQL syntax: connect by previous will build a tree after using the beginning and limit the number of layers that it can cross; however, performance does not guarantee a minimum.

0
source

I can take this place, but I find this an interesting problem. I want this to be a more open discussion, as I think I could learn a lot.

It seems that this can be achieved by executing several selection statements - something like SELECT id FROM mytable WHERE source="A" ORDER BY distance ASC LIMIT 1 . After completing something like this in the while loop and replacing "A" with the id variable, it will do the trick, no?

For example (A - source, B - final destination):

 DECLARE var_id as INT WHILE var_id != 'B' BEGIN SELECT id INTO var_id FROM mytable WHERE source="A" ORDER BY distance ASC LIMIT 1 SELECT var_id END 

Is it something like this job? (The code is sloppy, but the idea seems sound.) Comments are more than welcome.

0
source

Attach the table to yourself using the destination connected to the source. Add distance from two links. Paste this as a new link with the left side source, right side and total distance, if not already indicated in the table. If it is in the table but with a shorter total distance, update the existing row with a shorter distance.

Repeat this until you get new links added to the table and no updates with a shorter distance. Your table now contains a link for each possible combination of source and destination with a minimum distance between them. It would be interesting to see how many repetitions are required.

This will not track the intermediate path between the source and destination, but only provides the shortest distance.

0
source

IIUC should do this, but I'm not sure if this is really viable (in performance) due to the large number of rows involved and for CROSS JOIN

 SELECT t1.src AS A, t1.dest AS x, t2.dest AS B, t1.distance + t2.distance AS total_distance FROM big_table AS t1 CROSS JOIN big_table AS t2 ON t1.dst = t2.src WHERE A = 'insert source (A) here' AND B = 'insert destination (B) here' ORDER BY total_distance ASC LIMIT 1 

The above snippet will work for the case when you have two lines in the form A-> x and x-> B, but not for other combinations (for example, A-> x and B-> x). Extending it to cover all four combinations should be trivial (for example, create a view that duplicates each row and swaps src and dest).

0
source

Source: https://habr.com/ru/post/899925/


All Articles