Pandas Random Weighted Choice

I would like to randomly select a value based on weighting with Pandas.

df:

    0  1   2   3   4   5
0  40  5  20  10  35  25
1  24  3  12   6  21  15
2  72  9  36  18  63  45
3   8  1   4   2   7   5
4  16  2   8   4  14  10
5  48  6  24  12  42  30

I know I can use np.random.choice, for example:

x = np.random.choice(
  ['0-0','0-1',etc.], 
  1,
  p=[0.4,0.24 etc.]
)

So, I would like to get similar output with an alternative to np.random.choice that draws from df, using Pandas. I would like to do this more efficiently than manually entering the values, as I did above.

With np.random.choice, I know that the probabilities must sum to 1. I'm not sure how to handle that, or how to randomly select a value based on weights using Pandas.

Ideally, the output would look something like this: if 40 is randomly chosen, 0-0 (column 0, row 0), and so on.


Stack the DataFrame:

stacked = df.stack()

Normalize the weights (so that they sum to 1):

weights = stacked / stacked.sum()
# As GeoMatt22 pointed out, this part is not necessary. See the other comment.

Then use sample:

stacked.sample(1, weights=weights)
Out: 
1  2    12
dtype: int64

# Or without normalization, stacked.sample(1, weights=stacked)
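Putting the steps above together, a minimal self-contained sketch might look like the following. The DataFrame is rebuilt from the values in the question. The helper name pick_weighted_cell, the random_state seed, and the "row-column" label order are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# The DataFrame from the question.
df = pd.DataFrame(
    [
        [40, 5, 20, 10, 35, 25],
        [24, 3, 12, 6, 21, 15],
        [72, 9, 36, 18, 63, 45],
        [8, 1, 4, 2, 7, 5],
        [16, 2, 8, 4, 14, 10],
        [48, 6, 24, 12, 42, 30],
    ]
)


def pick_weighted_cell(frame, random_state=None):
    """Pick one cell with probability proportional to its value.

    Hypothetical helper; returns (row_label, column_label, value).
    """
    stacked = frame.stack()  # 2D -> 1D Series indexed by (row, column)
    # sample() rescales weights that do not sum to 1, so raw values work.
    picked = stacked.sample(1, weights=stacked, random_state=random_state)
    row, col = picked.index[0]
    return row, col, picked.iloc[0]


row, col, value = pick_weighted_cell(df, random_state=0)
label = f"{row}-{col}"  # assumed label style, e.g. "0-0" for row 0, column 0
print(label, value)
```

Swap the order in the f-string if the label should be column first.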

DataFrame.sample lets you sample from either rows or columns. Consider this:

df.sample(1, weights=[0.4, 0.3, 0.1, 0.1, 0.05, 0.05])
Out: 
    0  1   2  3   4   5
1  24  3  12  6  21  15

(It selects one row: the first row with a 40% chance, the second with a 30% chance, and so on.)
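One way to sanity-check those row weights is to draw many rows with replacement and compare the observed frequencies with the weights. This is only a rough sketch: the sample size of 10,000 and the seed are arbitrary choices, not something from the original answer.

```python
import pandas as pd

# The DataFrame from the question.
df = pd.DataFrame(
    [
        [40, 5, 20, 10, 35, 25],
        [24, 3, 12, 6, 21, 15],
        [72, 9, 36, 18, 63, 45],
        [8, 1, 4, 2, 7, 5],
        [16, 2, 8, 4, 14, 10],
        [48, 6, 24, 12, 42, 30],
    ]
)

weights = [0.4, 0.3, 0.1, 0.1, 0.05, 0.05]

# Draw rows with replacement so the same row can be picked repeatedly.
draws = df.sample(n=10_000, replace=True, weights=weights, random_state=0)

# Share of draws per row label; should be close to the weights above.
freq = draws.index.value_counts(normalize=True).sort_index()
print(freq)
```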

This is also possible:

df.sample(1, weights=[0.4, 0.3, 0.1, 0.1, 0.05, 0.05], axis=1)
Out: 
   1
0  5
1  3
2  9
3  1
4  2
5  6

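It is the same process, but now the 40% chance is attached to the first column and the selection is made among columns. However, your question suggests that you do not want to select rows or columns; you want to select the elements inside them. That is why I changed the dimension from 2D to 1D.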

df.stack()

Out: 
0  0    40
   1     5
   2    20
   3    10
   4    35
   5    25
1  0    24
   1     3
   2    12
   3     6
   4    21
   5    15
2  0    72
   1     9
   2    36
   3    18
   4    63
   5    45
3  0     8
   1     1
   2     4
   3     2
   4     7
   5     5
4  0    16
   1     2
   2     8
   3     4
   4    14
   5    10
5  0    48
   1     6
   2    24
   3    12
   4    42
   5    30
dtype: int64

Now, if I call sample on this, I sample from the elements of this Series:

df.stack().sample()
Out: 
1  0    24
dtype: int64

It selected row 1, column 0.
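For comparison with the np.random.choice call in the question, here is a rough NumPy sketch of the same idea: flatten the values, normalize them so they sum to 1, draw a flat position, and map it back to a row and column. The use of np.random.default_rng, the seed, and the variable names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# The DataFrame from the question.
df = pd.DataFrame(
    [
        [40, 5, 20, 10, 35, 25],
        [24, 3, 12, 6, 21, 15],
        [72, 9, 36, 18, 63, 45],
        [8, 1, 4, 2, 7, 5],
        [16, 2, 8, 4, 14, 10],
        [48, 6, 24, 12, 42, 30],
    ]
)

rng = np.random.default_rng(0)  # seed used only for reproducibility

values = df.to_numpy()
# NumPy requires the probabilities to sum to 1, so normalize explicitly.
p = values.ravel() / values.sum()

# Draw a position in the flattened array, then recover (row, column).
flat = rng.choice(values.size, p=p)
i, j = np.unravel_index(flat, values.shape)
row, col = df.index[i], df.columns[j]

print(f"{row}-{col}", values[i, j])
```

Unlike the stack().sample() approach, this version needs the explicit normalization step, which is the requirement mentioned in the question.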


Source: https://habr.com/ru/post/1681907/

