Ugh. This algorithm is incorrect; there is no proof because it is wrong, and therefore it is impossible to prove it right. ;) I'm leaving it here because I'm too attached to it to delete it completely, and it is a good demonstration of why you should formally prove your algorithms instead of saying "it looks right, there's no way it could fail!"
I'm posting this solution without a proof, for now. I have a sketch of one, but I'm having trouble proving the optimal substructure for this problem. Anyway...
Problem
Given a square array of numbers, select as many "non-overlapping" numbers as possible such that the sum of the selected numbers is maximal. "Non-overlapping" means that no two selected numbers can come from the same row or the same column.
Algorithm
- Let A be a square array of n by n numbers.
- Let Aij denote the element of A in the i-th row and j-th column.
- Let S( i1:i2 , j1:j2 ) denote the optimal sum of non-overlapping numbers for the square subarray of A formed by the intersection of rows i1 through i2 and columns j1 through j2.
Then the optimal sum of non-overlapping numbers for the whole array is denoted by S( 1:n , 1:n ) and is defined as follows:
    S( 1:n , 1:n ) = max of:
        S( 2:n   , 2:n   ) + A11
        S( 2:n   , 1:n-1 ) + A1n
        S( 1:n-1 , 2:n   ) + An1
        S( 1:n-1 , 1:n-1 ) + Ann

(applied recursively). Note that S( i:i , j:j ) is simply Aij.
That is, the optimal sum for a square array of size n can be determined by computing the optimal sum for each of the four corner subarrays of size n-1, and then taking the maximum, over those four choices, of the subarray's sum plus the single corner element that was left out.
    S for   |# # # #|
            |# # # #|
            |# # # #|
            |# # # #|

    is the best of the sums S for:

    |#      |    |      #|    |# # #  |    |  # # #|
    |  # # #|    |# # #  |    |# # #  |    |  # # #|
    |  # # #|    |# # #  |    |# # #  |    |  # # #|
    |  # # #|    |# # #  |    |      #|    |#      |
Implementation
The recurrence above translates directly into a recursive solution:
    def S(A, i1, i2, j1, j2):
        # Base case: a 1x1 subarray is just its single element.
        if (i1 == i2) and (j1 == j2):
            return A[i1][j1]
        else:
            return max(
                S(A, i1+1, i2, j1+1, j2) + A[i1][j1],
                S(A, i1+1, i2, j1, j2-1) + A[i1][j2],
                S(A, i1, i2-1, j1+1, j2) + A[i2][j1],
                S(A, i1, i2-1, j1, j2-1) + A[i2][j2],
            )
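With 0-based indices, the whole array is S(A, 0, n-1, 0, n-1). A tiny example of the kind of call involved (the numbers here are just ones I made up):

```python
A = [[1, 2],
     [4, 8]]

# Best non-overlapping pair is 1 + 8 = 9 (beats 2 + 4 = 6).
print(S(A, 0, 1, 0, 1))  # -> 9
```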
Note that this makes O(4^n) calls to S()!! That is much better than the O(n!) time complexity of the brute-force solution, but it is still terrible performance.
It is important to note that many of those calls repeat the same parameters. For example, when solving a 3x3 array, each 2x2 subarray is solved many times over.
This suggests two possible ways to speed things up:

- Make the recursive function S() cache its results, so that for each (i1, i2, j1, j2) it only has to compute S(A, i1, i2, j1, j2) once. S() then only needs to calculate O(n^3) results; all other calls are answered from the cache. (This is called memoising; see the sketch just after this list.)
- Instead of starting at the top with the full n*n array and working down to smaller and smaller subproblems, start at the bottom with the smallest possible subproblems and build up to the n*n problem. This is called dynamic programming. It is also O(n^3), but it is noticeably faster than the memoised version because you do not pay the overhead of recursion and cache lookups on every call.
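Here is roughly what the memoised version might look like in Python (my own sketch, not code from the original; the names S_memo and go are invented, and it assumes 0-based indexing):

```python
from functools import lru_cache

def S_memo(A):
    # Top-down recursion with a cache keyed by (i1, i2, j1, j2),
    # so each square subarray is only solved once.
    @lru_cache(maxsize=None)
    def go(i1, i2, j1, j2):
        if i1 == i2 and j1 == j2:
            return A[i1][j1]
        return max(
            go(i1+1, i2, j1+1, j2) + A[i1][j1],
            go(i1+1, i2, j1, j2-1) + A[i1][j2],
            go(i1, i2-1, j1+1, j2) + A[i2][j1],
            go(i1, i2-1, j1, j2-1) + A[i2][j2],
        )

    n = len(A)
    return go(0, n-1, 0, n-1)
```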
The dynamic programming solution has the following form:
- Find optimal solutions for all 1x1 subarrays. (Trivial.)
- Find optimal solutions for all 2x2 subarrays.
- Find optimal solutions for all 3x3 subarrays.
- ...
- Find optimal solutions for all (n-1)x(n-1) subarrays.
- Find the optimal solution for the complete n x n array.
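A bottom-up version along those lines might look something like this (again just a sketch under the same assumptions; the name S_dp is invented):

```python
def S_dp(A):
    n = len(A)
    # best[i][j] holds the optimal sum for the current k x k subarray
    # whose top-left corner is at (i, j); start with k = 1.
    best = [[A[i][j] for j in range(n)] for i in range(n)]
    for k in range(2, n + 1):
        new = [[0] * (n - k + 1) for _ in range(n - k + 1)]
        for i in range(n - k + 1):
            for j in range(n - k + 1):
                new[i][j] = max(
                    best[i+1][j+1] + A[i][j],          # keep the top-left element
                    best[i+1][j]   + A[i][j+k-1],      # keep the top-right element
                    best[i][j+1]   + A[i+k-1][j],      # keep the bottom-left element
                    best[i][j]     + A[i+k-1][j+k-1],  # keep the bottom-right element
                )
        best = new
    return best[0][0]
```

Each pass fills in all the k x k subarrays from the (k-1) x (k-1) results, which is the same O(n^3) amount of work as the memoised version but with plain loops instead of recursive calls.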
Notes on this solution:
- There is no proof yet. I'm working on it.
- You will notice that the algorithm above only gives you S(), the optimal sum. It does not tell you which numbers actually make up that sum. You can add your own bookkeeping to trace the path back to the chosen numbers (one possible way is sketched after these notes).
- The algorithm above does not guarantee that a tie such as 2,2 vs. 1,3 will be broken in favor of making all the individual numbers as large as possible (so that 2,2 wins). You can define max() to break ties in favor of the largest possible numbers, and I believe that will do what you want, but I cannot prove it.
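For the second note, here is one rough way the bookkeeping could be done (entirely my own sketch, not part of the original answer; it mirrors the naive recursion, so in practice it would need the same memoising or bottom-up treatment):

```python
def S_with_cells(A, i1, i2, j1, j2):
    # Returns (best_sum, list_of_chosen_cells) for the given square subarray.
    if i1 == i2 and j1 == j2:
        return A[i1][j1], [(i1, j1)]
    options = [
        (S_with_cells(A, i1+1, i2, j1+1, j2), (i1, j1)),  # keep top-left
        (S_with_cells(A, i1+1, i2, j1, j2-1), (i1, j2)),  # keep top-right
        (S_with_cells(A, i1, i2-1, j1+1, j2), (i2, j1)),  # keep bottom-left
        (S_with_cells(A, i1, i2-1, j1, j2-1), (i2, j2)),  # keep bottom-right
    ]
    (sub_sum, cells), (r, c) = max(options, key=lambda o: o[0][0] + A[o[1][0]][o[1][1]])
    return sub_sum + A[r][c], cells + [(r, c)]
```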
General notes:
Dynamic programming is a powerful method for developing fast algorithms for any problem that exhibits two properties:
- Optimal substructure: the problem can be broken into smaller subproblems, and the optimal solutions to those subproblems can be combined into an optimal solution to the original problem.
- Overlapping subproblems: there are relatively few distinct subproblems that actually need to be solved, and the solutions to those subproblems are reused many times.
If the problem has optimal substructure and breaks into only slightly smaller pieces, say a problem of size n broken into subproblems of size n-1, then the problem is a good candidate for dynamic programming.
If, on the other hand, the problem breaks into much smaller pieces, say a problem of size n chopped into two halves of size n/2, then that is divide and conquer, not dynamic programming. Divide-and-conquer solutions are generally very fast; for example, binary search will find an element in a sorted array in O(log n) time.
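As a point of contrast with the table-filling above, a divide-and-conquer routine like binary search throws away half the problem at every step (a standard sketch, not something from the original answer):

```python
def binary_search(sorted_list, target):
    # Each step halves the range being searched, giving O(log n) steps.
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid
        elif sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not found
```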