It is not 100% clear from your question, but I understand that you are looking for a faster algorithm to search for a certain number of the best / cheapest combinations of departure and arrival flights.
You can do this much faster by sorting the departure and arrival flight lists individually by cost, and then using the heap to expand the next best combinations one by one until you have enough.
Here the complete algorithm is in Python, but without using any special libraries, only standard data structures, so this should be easily ported to any other language:
NUM_FLIGHTS, NUM_BEST = 1000, 100
This is something like this:
- sort both
dep departure and arr flights by cost - create a heap and put the best combination (the best departure and the best arrival), as well as the corresponding indices in your lists in the heap:
(dep[0] + arr[0], 0, 0) - repeat until you have enough combinations or there are no more elements in the heap:
- pull the best item from the heap (sorted by total cost)
- if it meets the constraints, add it to the result set
- make sure you don't add flights to the result set twice using the
visited set - add two "adjacent" combinations to the heap, that is, taking the same flight from
dep , and the next from arr , and the next from dep and the same from arr , i.e. (dep[i+1] + arr[j], i+1, j) and (dep[i] + arr[j+1], i, j+1)
Here is a very small processed example. The axes are (expenses) flights dep and arr , and the entries in the table are presented in the form n(c)m , where n is the iteration into which the entry in heap was added (if at all), c is the cost, and m - iteration added to the list of 'top 10' result (if any).
dep\arr 1 3 4 6 7 2 0(3)1 1(5)4 4(6)8 8(8)- - 2 1(3)2 2(5)6 6(6)9 9(8)- - 3 2(4)3 3(6)7 7(7)- - - 4 3(5)5 5(7)- - - - 6 5(7)10 - - - - Result: (1,2), (1,2), (1,3), (3,2), (1,4), (3,2), (3,3), (2,4), (2,4), (1,6)
Notice how the sums in the columns and rows of the matrix always increase, so the best results can always be found in the somewhat triangular area in the upper left corner. Now the idea is that if your best combination (the first on the heap) is dep[i], arr[i] , then there is no need to check, for example, the combination dep[i+2], arr[i] before checking dep[i+1], arr[i] , which should have a lower overall cost, so you add dep[i+1], arr[i] (and similarly dep[i], arr[i+1] ) and repeat by selecting the next item from the heap.
I compared the results of this algorithm with the results of your approach with brute force, and the received flights coincide, i.e. The algorithm works and always gives an optimal result. The complexity must be O (n log (n)) for sorting the departure and arrival lists (n is the number of flights in these source lists), plus O (m log (m)) for the heap cycle (m iterations with log (m) work on iteration, m is the number of elements in the list of results).
It finds the best 1000 combinations of 100,000 departures and 100,000 arrival flights (totaling 1,000,000,000,000 possible combinations) in less than one second.
Please note that these numbers are intended for the case when you do not have additional restrictions, i.e. Each departure flight can be combined with each arrival flight. If there are limitations, you can use the is_compatible function drawn in the code above to test them and skip this pairing. This means that for each incompatible pair with a low total cost, the cycle requires another iteration. This means that in the worst case, for example, if there are no compatible pairs at all or when the only compatible pairs are those with the maximum total cost, the algorithm can actually expand all the combinations.
On average, however, this should not be, and the algorithm should run pretty quickly.