How can I assign random values to an array of n elements so that the following constraints are satisfied?

Suppose a is my array of n elements.

Constraints:

  • alpha <= a[i] <= beta.
  • 0 <= alpha <= beta <= 1.
  • The sum of all elements must be exactly 1.

Assume such an array is always possible for the given alpha and beta values.

Here is my idea:

Randomly assign values between alpha and beta. Compute the sum of the elements of the array. If the sum is > 1, subtract some x from every element so that the constraints are met; if the sum is < 1, add some x to every element so that the constraints are met.
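As a quick illustration of the difficulty (my own sketch, not part of the question): shifting every element by the same x = (sum - 1) / n does fix the sum, but it can push individual elements outside [alpha, beta]:

```python
import random

def shift_to_sum_one(a):
    # Subtract the same x from every element so the total becomes exactly 1.
    # Note: this does nothing to keep each element inside [alpha, beta].
    x = (sum(a) - 1) / len(a)
    return [ai - x for ai in a]

alpha, beta = 0.0, 0.2
a = [random.uniform(alpha, beta) for _ in range(10)]
b = shift_to_sum_one(a)
print(abs(sum(b) - 1) < 1e-9)                # True: the sum is fixed
print(all(alpha <= bi <= beta for bi in b))  # may be False: bounds can break
```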

But I am having trouble computing the adjusted values.

+5
3 answers

This can be done if alpha × n ≤ 1 ≤ beta × n.

You can do this by drawing n random numbers. Let r_i be the i-th random number, let R be the sum of the random numbers, and let S = (beta - alpha) / (1 - n × alpha) × R. Set a[i] = alpha + r_i / S × (beta - alpha). As long as S is not less than the largest of the random numbers, every element of a lies between alpha and beta, and their sum is exactly 1.

    #include <iostream>
    #include <cassert>
    #include <cmath>
    #include <random>
    #include <vector>
    #include <algorithm>
    #include <numeric>
    #include <functional>
    #include <iterator>

    int main() {
        const double alpha = 0.05, beta = 0.15;
        const int n = 10;
        if (!(n * alpha <= 1.0 && 1.0 <= n * beta)) {
            std::cout << "unable to satisfy constraints.\n";
            return 0;
        }
        // degenerate cases: the only solution is a[i] = 1/n for all i
        if (alpha == beta || std::fabs(1.0 - n * alpha) < 1e-6 ||
            std::fabs(1.0 - n * beta) < 1e-6) {
            std::cout << "Solution for alpha = " << alpha << ", beta = " << beta
                      << ", n = " << n << ":\n";
            std::fill_n(std::ostream_iterator<double>(std::cout, " "), n, 1.0 / n);
            std::cout << "\nSum: 1.0\n";
            return 0;
        }
        std::vector<int> r;
        double S;
        {
            std::random_device rd;
            std::seed_seq seed{rd(), rd(), rd(), rd(), rd(), rd(), rd(), rd()};
            std::mt19937 eng(seed);
            std::uniform_int_distribution<> dist(0, 1000);
            // redraw until S >= max(r), which guarantees a[i] <= beta
            do {
                r.clear();
                std::generate_n(back_inserter(r), n, std::bind(dist, std::ref(eng)));
                int sum = std::accumulate(begin(r), end(r), 0);
                S = (beta - alpha) / (1 - n * alpha) * sum;
            } while (S < *std::max_element(begin(r), end(r)));
        }
        std::vector<double> a(n);
        std::transform(begin(r), end(r), begin(a),
                       [&](int ri) { return alpha + ri / S * (beta - alpha); });
        for (double ai : a) {
            assert(alpha <= ai && ai <= beta);
        }
        std::cout << "Solution for alpha = " << alpha << ", beta = " << beta
                  << ", n = " << n << ":\n";
        std::copy(begin(a), end(a), std::ostream_iterator<double>(std::cout, " "));
        std::cout << '\n';
        std::cout << "Sum: " << std::accumulate(begin(a), end(a), 0.0) << '\n';
    }

Solution for alpha = 0.05, beta = 0.15, n = 10:
0.073923 0.117644 0.0834555 0.139368 0.101696 0.0846471 0.115261 0.0759395 0.0918882 0.116178
Sum: 1
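The same scale-and-redraw scheme can be sketched in Python (my paraphrase of the C++ above, assuming the non-degenerate case n × alpha < 1 < n × beta and alpha < beta):

```python
import random

def scaled_random(n, alpha, beta):
    # Assumes the non-degenerate case: n*alpha < 1 < n*beta and alpha < beta.
    while True:
        r = [random.randint(0, 1000) for _ in range(n)]
        R = sum(r)
        S = (beta - alpha) / (1 - n * alpha) * R
        # Accept only when S >= max(r): then r_i / S <= 1 and a[i] <= beta.
        if R > 0 and S >= max(r):
            break
    return [alpha + ri / S * (beta - alpha) for ri in r]

a = scaled_random(10, 0.05, 0.15)
print(abs(sum(a) - 1) < 1e-9)  # True: the elements sum to 1
```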


You did not specify which distribution you want the algorithm to produce, but I want to note that this algorithm does not necessarily generate all possible solutions with equal probability; I believe some solutions are more likely to be produced than others.

+2

(As I mentioned in a comment, there is a hidden requirement on the ranges. There must be at least one element <= 1/n, otherwise the sum would be > 1; this forces alpha <= 1/n. Similarly, we need beta >= 1/n.)

We could do the following:

  • Let x_0 = 0.
  • Pick a random number in [x_i, C], and call it x_(i+1). The constant C is described below.
  • Repeat the step above n-1 times to obtain {x_1, x_2, ..., x_(n-1)}.
  • Now the numbers {(x_1 - x_0), (x_2 - x_1), ..., (C - x_(n-1))} are nonnegative and sum to C. Call them {y_1, y_2, ..., y_n}.
  • Now consider k_i = alpha + y_i * (beta - alpha). Since y_i is in [0, C], each k_i also lies between alpha and beta, provided C <= 1.
  • The sum of the k_i is n * alpha + (beta - alpha) * (sum of y_i), which is n * alpha + (beta - alpha) * C. This should equal 1.

Solving for C, we obtain C = (1 - n * alpha) / (beta - alpha).

To illustrate: if alpha = 0.4, beta = 1, n = 2, we get C = (1 - 0.8) / 0.6 = 1/3.

The random value for x_1 is, say, 1/8 (we only need (n-1) = 1 of them).

y_i = {1/8, (1/3 - 1/8)} = {1/8, 5/24}

beta - alpha = 0.6

k_i = {0.4 + 0.6/8, 0.4 + 0.6 * 5/24} = {0.475, 0.525}

Therefore, the required array is {0.475, 0.525}, which sums to 1.

The main assumption here is that C < 1. I think we can show that C < 1 is required for the problem to be feasible, but it is too late here for me to prove it.

Note that the above gives uniformly random numbers, since the numbers x_i are randomly selected in the range (0, C).
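The steps above can be sketched in Python (my own sketch, not the answerer's code; function names are mine, and it assumes alpha < beta and 0 <= C <= 1):

```python
import random

def stick_breaking(n, alpha, beta):
    # C is chosen so that the final values sum to exactly 1.
    C = (1 - n * alpha) / (beta - alpha)
    assert 0 <= C <= 1, "this method assumes C <= 1"
    # x_0 = 0; each x_(i+1) is drawn uniformly from [x_i, C].
    x = [0.0]
    for _ in range(n - 1):
        x.append(random.uniform(x[-1], C))
    # The successive differences y_i are nonnegative and sum to C.
    y = [x[i + 1] - x[i] for i in range(n - 1)] + [C - x[-1]]
    return [alpha + yi * (beta - alpha) for yi in y]

k = stick_breaking(2, 0.4, 1.0)
print(abs(sum(k) - 1) < 1e-9)  # True: the two values sum to 1
```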

+2

I realized that the problem itself is not trivial, but I found a simple algorithm which I think (not yet proven) generates a uniform distribution over all possible values. This algorithm must have been studied somewhere, since it is quite simple, but I have not tried to find an analysis of it. The algorithm is as follows:

  • Iterate from 1 to n.
  • At each step, compute the effective lower and upper bounds, assuming that all remaining values take the maximum value (for the lower bound) or the minimum value (for the upper bound).
  • Generate a uniformly distributed number between those bounds (if no number satisfies them, the constraints are unsatisfiable).
  • Append the generated number to the array.
  • At the end, shuffle the resulting array.

code:

    # -*- coding: utf-8 -*-
    """
    Produce an array of random numbers between alpha and beta which sums to 1.
    Assumption: there is always a solution for the given constraints.
    """
    import random
    import sys

    def shuffle(arr):
        result = []
        while len(arr) > 0:
            idx = random.randint(0, len(arr) - 1)
            result.append(arr[idx])
            del arr[idx]
        return result

    def main():
        if len(sys.argv) < 4:
            print('Usage: random_constrained.py <n> <alpha> <beta>\n'
                  'Where:\n\tn is an integer\n\t0 <= alpha <= beta <= 1')
            sys.exit(0)
        n = int(sys.argv[1])
        alpha = float(sys.argv[2])
        beta = float(sys.argv[3])

        result = []
        total = 0
        for i in range(n):
            # bounds assuming the remaining n-i-1 values take beta (for low)
            # or alpha (for high)
            low = max(alpha, 1 - total - (n - i - 1) * beta)
            high = min(beta, 1 - total - (n - i - 1) * alpha)
            if high < low - 1e-10:
                print('Error: constraint not satisfiable '
                      '(high={:.3f}, low={:.3f})'.format(high, low))
                sys.exit(1)
            num = random.uniform(low, high)
            result.append(num)
            total += num

        result = shuffle(result)
        print(result)

    if __name__ == '__main__':
        main()

Sample outputs:

  $ python random_constrained.py 4 0 0.5
 [0.06852504971359885, 0.39391285249108765, 0.24215492185626314, 0.2954071759390503]
 $ python random_constrained.py 4 0 0.5
 [0.2519926400714304, 0.4138640296394964, 0.27906367876610466, 0.055079651522968565]
 $ python random_constrained.py 4 0 0.5
 [0.11505150404455633, 0.16665881845206237, 0.45371668123772924, 0.264572996265652]
 $ python random_constrained.py 4 0 0.5
 [0.31689744182294444, 0.11233051635974067, 0.3599600067081529, 0.21081203510916197]
 $ python random_constrained.py 4 0 0.5
 [0.16158825078700828, 0.18989326608974527, 0.1782112102703714, 0.470307272852875]

 $ python random_constrained.py 5 0 0.2
 [0.19999999999999998, 0.2, 0.19999999999999996, 0.19999999999999996, 0.20000000000000004]
 $ python random_constrained.py 5 0 0.2
 [0.2, 0.2, 0.19999999999999998, 0.2, 0.19999999999999996]
 $ python random_constrained.py 5 0 0.2
 [0.20000000000000004, 0.19999999999999998, 0.19999999999999996, 0.19999999999999998, 0.2]
 $ python random_constrained.py 5 0 0.2
 [0.2, 0.20000000000000004, 0.19999999999999996, 0.19999999999999996, 0.19999999999999996]

 $ python random_constrained.py 2 0.4 1
 [0.5254259945319483, 0.47457400546805173]
 $ python random_constrained.py 2 0.4 1
 [0.5071103628251259, 0.4928896371748741]
 $ python random_constrained.py 2 0.4 1
 [0.4595236988530377, 0.5404763011469623]
 $ python random_constrained.py 2 0.4 1
 [0.44218002983240046, 0.5578199701675995]
 $ python random_constrained.py 2 0.4 1
 [0.4330169754142243, 0.5669830245857757]
 $ python random_constrained.py 2 0.4 1
 [0.543183373724851, 0.45681662627514896]

I'm still not sure how best to deal with the floating-point precision error.

I ran some tests to verify uniformity for the case n=2, generating 100,000 arrays (with alpha=0.4, beta=0.6), distributing the values into 10 equal-size buckets from alpha to beta, and counting the number of occurrences:

  First number: [9998, 9966, 9938, 9952, 10038, 10161, 9899, 10007, 10054, 9987]
 Second number: [9987, 10054, 10007, 9899, 10161, 10038, 9952, 9938, 9966, 9998]

For n=4, alpha=0, beta=0.3, with 10,000 attempts:

  [0, 0, 0, 304, 430, 569, 730, 1135, 1874, 4958]
 [0, 0, 0, 285, 492, 576, 805, 1113, 1775, 4954]
 [0, 0, 0, 248, 465, 578, 769, 1077, 1839, 5024]
 [0, 0, 0, 252, 474, 564, 800, 1100, 1808, 5002]

We see that each number has more or less the same distribution, so there is no bias toward any position.
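The bucket counting used in this check can be sketched as follows (my own sketch, not the answerer's test code):

```python
def bucket_counts(samples, alpha, beta, nbuckets=10):
    # Count how many samples fall into each of nbuckets equal-size
    # buckets spanning [alpha, beta].
    counts = [0] * nbuckets
    for v in samples:
        idx = min(int((v - alpha) / (beta - alpha) * nbuckets), nbuckets - 1)
        counts[idx] += 1
    return counts

print(bucket_counts([0.40, 0.45, 0.59], 0.4, 0.6))
# [1, 0, 1, 0, 0, 0, 0, 0, 0, 1]
```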

+1

Source: https://habr.com/ru/post/1235273/

