Algorithm for generating a numerical matrix with given boundaries

I am trying to create a matrix of numbers with 7 rows and 4 columns. Each row should be 100, and each column should have a uniform spread (if allowed) between the range of min and max (indicated below).

Purpose:

       C1      C2    C3    C4   sum   range 
 1     low                      100    ^
 2     ..                              |  
 3     ..                              |
 4     ..                              | 
 5     ..                              |
 6     ..                              |
 7     high                            _

c1_high = 98
c1_low = 75
c2_high = 15
c2_low = 6
c3_high = 8
c3_low = 2
c4_low = 0.05
c4_high =0.5

In addition to this, I need each line to be as linear as possible, although for data with a second-order polynomial it would be sufficient (for r ^ 2> 0.98).

I am currently trying to do this using the following sudocode:

  • generates a random number between ranges for c1, c2, c3 and c4.
  • repeat this 7 times
  • check the correlation between each generated value c1 and the range of numbers from 1 to 7. For example:

enter image description here

  1. repeat step 3 for c2, c3 and c4.

  2. Breakthrough cycle if steps 3 and 4 are successful

, , .

?

:

import pandas as pd
import numpy as np
from sklearn.utils import shuffle

c1_high = 98
c1_low = 75
c2_high = 15
c2_low = 6
c3_high = 8
c3_low = 2
c4_low = 0.05
c4_high =0.5

def matrix_gen(): #generates matrix within min and max values
    container =[]
    d={}
    offset = np.linspace(0.05,1,9)
    c1= np.linspace(c1_low, c1_high, 7)
    c2= np.linspace(c2_low, c2_high, 7)
    c3= np.linspace(c3_low, c3_high, 7)
    c4= np.linspace(c4_low, c4_high, 7)

    for i in np.arange(7):
        d["row{0}".format(i)]=[item[i] for item in [c1,c2,c3,c4]]

    df =pd.DataFrame(d)
    df.loc[4,:] = df.iloc[0,:][::-1].values
    df1 = df.drop(0)
    df1.loc[5,:] = df1.sum(axis=0)
    new_name = df1.index[-1]
    df1 = df1.rename(index={new_name: 'sum'})
    return df1

m = matrix_gen()
print(m)

out:

       row0        row1        row2     row3        row4       row5  row6
1      6.00    7.500000    9.000000   10.500   12.000000  13.500000  15.0
2      2.00    3.000000    4.000000    5.000    6.000000   7.000000   8.0
3      0.05    0.125000    0.200000    0.275    0.350000   0.425000   0.5
4     98.00   94.166667   90.333333   86.500   82.666667  78.833333  75.0
sum  106.05  104.791667  103.533333  102.275  101.016667  99.758333  98.5

:

def shuf():  # attempts at shuffling the values around such that the 'sum' row is as close to 100 as possible. 
    df = matrix_gen()
    df1 = df[1:4]
    count =0
    while True:
        df1 = shuffle(df1)
        df1.loc[5,:] = df1.sum(axis=0)
        for i in df1.loc[5].values:
            if 98<= i <=100:
                print('solution')
                return df1
            else:
                count+=1
                print(count)
                continue

opt = shuf()
print(opt)

, , 100. .

+4
2

.

"" (linspace(low, high, 7)). "abserr" "relerr" , . corrcoefs , 99.8%

. , :

  • 4 .
  • 7! , ( )
  • (7!) ^ 2
  • , , , , "abserr" "relerr"

, 100. , .

: solve, relerr improved_solve. , 100 improved_solve.

:

OP:

                ((75, 98), (6, 15), (2, 8), (0.05, 0.5))

solve relerr                            improved_solve relerr                  
table:                                  table:                                 
76.14213 15.22843  8.12183  0.50761     76.14213 15.22843  8.12183  0.50761    
79.02431 13.53270  7.01696  0.42603     79.02431 13.53270  7.01696  0.42603    
81.83468 11.87923  5.93961  0.34648     81.83468 11.87923  5.93961  0.34648    
84.57590 10.26644  4.88878  0.26888     84.57590 10.26644  4.88878  0.26888    
87.25048  8.69285  3.86349  0.19317     87.25048  8.69285  3.86349  0.19317    
89.86083  7.15706  2.86282  0.11928     89.86083  7.15706  2.86282  0.11928    
92.40924  5.65771  1.88590  0.04715     92.40924  5.65771  1.88590  0.04715    
avgerr:                                 avgerr:                                
 0.03239                                 0.03239                               
corrcoefs:                              corrcoefs:                             
 0.99977  0.99977  0.99977  0.99977      0.99977  0.99977  0.99977  0.99977 

, , , :

                ((11, 41), (4, 34), (37, 49), (0.01, 23.99))

, , .

solve relerr                            improved_solve relerr                  
table:                                  table:                                 
10.89217 18.81374 46.53926 23.75483     11.00037 24.00080 49.00163 15.99720    
26.00087  9.00030 49.00163 15.99720     16.00107 19.00127 45.00300 19.99467    
31.00207  4.00027 45.00300 19.99467     25.74512 13.86276 36.63729 23.75483    
16.00000 29.00000 43.00000 12.00000     35.99880  8.99970 46.99843  8.00307    
20.99860 33.99773 40.99727  4.00640     41.00000  4.00000 43.00000 12.00000    
40.99863 13.99953 36.99877  8.00307     20.99860 33.99773 40.99727  4.00640    
36.35996 24.23998 39.38996  0.01010     31.30997 29.28997 39.38996  0.01010    
avgerr:                                 avgerr:                                
 0.00529                                 0.00529                               
corrcoefs:                              corrcoefs:                             
 0.99993  0.99994  0.99876  0.99997      0.99989  0.99994  0.99877  0.99997 

, improved_solve solve:

                ((36.787862883725872, 43.967159949544317),
                 (40.522239654303483, 47.625869880574164),
                 (19.760537036548321, 49.183056694462799),
                 (45.701873101046154, 48.051424087501672))

solve relerr                            improved_solve relerr                  
table:                                  table:                                 
21.36407 23.53276 28.56241 26.54076     20.25226 26.21874 27.07599 26.45301    
22.33545 24.52391 26.03695 27.10370     21.53733 26.33278 25.10656 27.02333    
23.33149 25.54022 23.44736 27.68093     22.90176 26.45386 23.01550 27.62888    
24.35314 26.58266 20.79119 28.27301     24.35314 26.58266 20.79119 28.27301    
25.40141 27.65226 18.06583 28.88050     25.90005 26.71994 18.42047 28.95953    
26.47734 28.75009 15.26854 29.50403     27.55225 26.86656 15.88840 29.69279    
27.58205 29.87728 12.39644 30.14424     29.32086 27.02351 13.17793 30.47771    
avgerr:                                 avgerr:                                
 0.39677                                 0.39630                               
corrcoefs:                              corrcoefs:                             
 0.99975  0.99975  0.99975  0.99975      0.99847  0.99847  0.99847  0.99847 

:

import numpy as np
import itertools
import math

N_CHUNKS = 3

def improved_solve(LH, errtype='relerr'):
    N = math.factorial(7)
    # accept anything that looks like a 2d array
    LH = np.asanyarray(LH)
    # build equidistant columns
    C = np.array([np.linspace(l, h, 7) for l, h in LH])
    # subtract offset; it cheaper now than later
    c0, c1, c2, c3 = C - 25
    # list all permutiations of a single column
    p = np.array(list(itertools.permutations(range(7))))
    # split into left and right halves, compute all relative permutiations
    # and sort them by their sums of corresponding elements.
    # Left pairs in ascending, right pairs in descending order.
    L = np.sort(c0 + c1[p], axis=1)
    R = np.sort(c2 + c3[p], axis=1)[:, ::-1]
    # For each pair of permutations l in L, r in R compute the smallest
    # possible error (sum of squared deviations.)
    if errtype == 'relerr':
        err = np.empty((N, N))
        split = np.linspace(0, N, N_CHUNKS+1, dtype=int)[1:-1]
        for LCH, ECH in zip(np.split(L, split, axis=0),
                            np.split(err, split, axis=0)):
            dev = LCH[:, None] + R[None, :]
            ((dev / (100+dev))**2).sum(axis=-1, out=ECH)
        del dev
    elif errtype == 'abserr':
        err = (np.add.outer(np.einsum('ij,ij->i', L, L),
                            np.einsum('ij,ij->i', R, R))
               + np.einsum('ik, jk->ij', 2*L, R))
    else:
        raise ValueError
    # find pair of pairs with smallest error
    i = np.argmin(err.ravel())
    i1, i3 = np.unravel_index(i, (N, N))
    # recreate shuffled table
    c0, c1, c2, c3 = C        
    lidx = np.argsort(c0 + c1[p[i1]])
    ridx = np.argsort(c2 + c3[p[i3]])[::-1]
    C = np.array([c0[lidx], c1[p[i1]][lidx], c2[ridx], c3[p[i3]][ridx]])
    # correct rowsums, calculate error and corrcoef and return
    if errtype == 'relerr':
        result = C * (100.0 / C.sum(axis=0, keepdims=True))
        err = math.sqrt((((result-C)/C)**2).mean())
    else:
        result = C + (25 - C.mean(axis=0, keepdims=True))
        err = math.sqrt(((result-C)**2).mean())
    rs = np.sort(result, axis=1)
    cc = tuple(np.corrcoef(ri, range(7))[0, 1] for ri in rs)
    return dict(table=result.T, avgerr=err, corrcoefs=cc)

def solve(LH, errtype='relerr'):
    LH = np.asanyarray(LH)
    if errtype=='relerr':
        err1 = 200 / LH.sum()
        diff = np.diff(LH * err1, axis=1).ravel()
    elif errtype=='abserr':
        err1 = 25 - LH.mean()
        diff = np.diff(LH, axis=1).ravel()
    else:
        raise ValueError
    C = np.array([np.linspace(-d/2, d/2, 7) for d in diff])
    c0, c1, c2, c3 = C
    p = np.array(list(itertools.permutations(range(7))))
    L = np.sort(c0 + c1[p], axis=1)
    R = np.sort(c2 + c3[p], axis=1)[:, ::-1]
    err = (np.add.outer(np.einsum('ij,ij->i', L, L),
                        np.einsum('ij,ij->i', R, R))
           + np.einsum('ik, jk->ij', 2*L, R)).ravel()
    i = np.argmin(err)
    i1, i3 = np.unravel_index(i, (math.factorial(7), math.factorial(7)))
    L = np.argsort(c0 + c1[p[i1]])
    R = np.argsort(c2 + c3[p[i3]])[::-1]
    ref = [np.linspace(l, h, 7) for l, h in LH]
    if errtype=='relerr':
        c0, c1, c2, c3 = [np.linspace(l, h, 7) for l, h in LH * err1]
        C = np.array([c0[L], c1[p[i1]][L], c2[R], c3[p[i3]][R]])
        err2 = 100 / np.sum(C, axis=0)
        C *= err2
        cs = list(map(sorted, C))
        err = math.sqrt(sum((c/r-1)**2 for ci, ri in zip(cs, ref) for c, r in zip(ci, ri)) / 28)
    elif errtype=='abserr':
        c0, c1, c2, c3 = [np.linspace(l, h, 7) for l, h in LH + err1]
        C = np.array([c0[L], c1[p[i1]][L], c2[R], c3[p[i3]][R]])
        err2 = 25 - np.mean(C, axis=0)
        C += err2
        cs = list(map(sorted, C))
        err = math.sqrt(sum((c-r)**2 for ci, ri in zip(cs, ref) for c, r in zip(ci, ri)) / 28)
    else:
        raise ValueError
    cc = tuple(np.corrcoef(ci, range(7))[0, 1] for ci in cs)
    return dict(table=C.T, avgerr=err, corrcoefs=cc)

for problem in [((75, 98), (6, 15), (2, 8), (0.05, 0.5)),
                ((11, 41), (4, 34), (37, 49), (0.01, 23.99)),
                ((80, 94), (5, 14), (0.5, 5), (0.05, 0.5)),
                ((36.787862883725872, 43.967159949544317),
                 (40.522239654303483, 47.625869880574164),
                 (19.760537036548321, 49.183056694462799),
                 (45.701873101046154, 48.051424087501672))]:
    for errtype in ('relerr', 'abserr'):
        print()
        columns = []
        for solver in (solve, improved_solve):
            sol = solver(problem, errtype)
            column = [[' '.join((solver.__name__, errtype))]] + \
                     [[k + ':'] + [' '.join([f'{e:8.5f}' for e in r])
                                   for r in np.atleast_2d(v)]
                      for k, v in sol.items()]
            column = (line for block in column for line in block)
            columns.append(column)
        for l, r in zip(*columns):
            print(f"{l:39s} {r:39s}")

problems = []
for i in range(0):
    problem = np.sort(np.random.random((4, 2)), axis=1) * 50
    for errtype in ('relerr', 'abserr'):
        sol0 = solve(problem, errtype)
        sol1 = improved_solve(problem, errtype)
        if not np.allclose(sol0['table'], sol1['table']):
            print(i, end= " ")
            if np.abs((sol0['avgerr']-sol1['avgerr'])
                      /(sol0['avgerr']+sol1['avgerr']))>1e-6:
                print(problem)
                problems.append(problem)
                columns = []
                for sol, name in [(sol0, 'old '), (sol1, 'improved ')]:
                    column = [[name + errtype]] + \
                             [[k + ':'] + [' '.join([f'{e:8.5f}' for e in r])
                                           for r in np.atleast_2d(v)]
                              for k, v in sol.items()]
                    column = (line for block in column for line in block)
                    columns.append(column)
                for l, r in zip(*columns):
                    print(f"{l:39s} {r:39s}")
+1

, .

x (i, j) - , . :

sum(j, x(i,j)) = 100   ∀i
L(j) ≤ x(i,j) ≤ U(j)   ∀i,j
x(i,j) = x(i-1,j) + step(j) + deviation(i,j)
   special cases:
     x(1,j) = L(j) + deviation(1,j)
     and x(m,j) = U(j) + deviation(m,j)
step(j) ≥ 0
minimize sum((i,j), deviation(i,j)^2 )

. . LP.

.

, ( , ).

. , . . .

:

----     17 PARAMETER LO  

c1 80.000,    c2  5.000,    c3  0.500,    c4  0.050


----     17 PARAMETER UP  

c1 94.000,    c2 14.000,    c3  5.000,    c4  0.500

. , . LO UP .

:

(1) , . . :

----     53 PARAMETER init  initial matrix

            c1          c2          c3          c4      rowsum

r1      80.000       5.000       0.500       0.050      85.550
r2      82.333       6.500       1.250       0.125      90.208
r3      84.667       8.000       2.000       0.200      94.867
r4      87.000       9.500       2.750       0.275      99.525
r5      89.333      11.000       3.500       0.350     104.183
r6      91.667      12.500       4.250       0.425     108.842
r7      94.000      14.000       5.000       0.500     113.500

.. lo(j) up(j) .

(2) - , . :

----     53 VARIABLE y.L  after permutation

            c1          c2          c3          c4      rowsum

r1      94.000       5.000       0.500       0.125      99.625
r2      82.333      12.500       4.250       0.500      99.583
r3      89.333       8.000       2.000       0.200      99.533
r4      87.000       9.500       2.750       0.275      99.525
r5      84.667      11.000       3.500       0.350      99.517
r6      91.667       6.500       1.250       0.050      99.467
r7      80.000      14.000       5.000       0.425      99.425

"" .

(3) , , 100. . :

----     53 VARIABLE x.L  final values

            c1          c2          c3          c4      rowsum

r1      94.374       5.001       0.500       0.125     100.000
r2      82.747      12.503       4.250       0.500     100.000
r3      89.796       8.004       2.000       0.200     100.000
r4      87.469       9.506       2.750       0.275     100.000
r5      85.142      11.007       3.501       0.350     100.000
r6      92.189       6.510       1.251       0.050     100.000
r7      80.561      14.012       5.002       0.425     100.000


----     53 VARIABLE d.L  deviations

            c1          c2          c3          c4

r1       0.374       0.001 1.459087E-5 1.459087E-7
r2       0.414       0.003 9.542419E-5 9.542419E-7
r3       0.462       0.004 2.579521E-4 2.579521E-6
r4       0.469       0.006 4.685327E-4 4.685327E-6
r5       0.475       0.007 7.297223E-4 7.297223E-6
r6       0.522       0.010       0.001 1.123123E-5
r7       0.561       0.012       0.002 1.587126E-5

(2) (3) : .

:

enter image description here

, Cplex Gurobi.

, (, , ). P ( ). MIQP ( ). : . MIP. . , Python.

. , , init(i,j) , init. y(i,j) , .

+3

Source: https://habr.com/ru/post/1690552/


All Articles