Multiple rows per row in a pandas DataFrame (Python)

For each row in a column of a pandas DataFrame, I want to produce a specified number of rows, together with a new column whose values are a sub-selection of the original column's rows. I am doing this to build a large data matrix containing ranges of values as input for a model later.

As an example, I have a small DataFrame as follows:

df:
    A
1   1
2   2
3   3
.   ..

In this DataFrame, I would like to expand each row of column “A” into 3 rows and add a new column named “B”. The result should look something like this:

df:
    A   B
1   1   1
2   1   2
3   1   3
4   2   1
5   2   2
6   2   3
7   3   1
8   3   2
9   3   3
.   ..  ..

I know that I could loop over the DataFrame with iterrows() and build the rows for each value in "A" by hand.

Is there a better way to do this?


You can use numpy.repeat and numpy.tile with the DataFrame constructor:

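import numpy as np
import pandas as pd
# A: each value repeated 3 times in place; B: the whole column tiled 3 times
df = pd.DataFrame({'A': np.repeat(df['A'].values, 3),
                   'B': np.tile(df['A'].values, 3)})
print(df)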
   A  B
0  1  1
1  1  2
2  1  3
3  2  1
4  2  2
5  2  3
6  3  1
7  3  2
8  3  3
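
The difference between the two functions used above is easiest to see on a bare array. This is a minimal sketch; the array `values` is an illustrative stand-in for `df['A'].values`, not something from the original post.

```python
import numpy as np

# Illustrative stand-in for df['A'].values
values = np.array([1, 2, 3])

# np.repeat repeats each element in place
repeated = np.repeat(values, 3)

# np.tile repeats the whole array end to end
tiled = np.tile(values, 3)

print(repeated.tolist())  # [1, 1, 1, 2, 2, 2, 3, 3, 3]
print(tiled.tolist())     # [1, 2, 3, 1, 2, 3, 1, 2, 3]
```

Both results have the same length, so they can be placed side by side as columns of one DataFrame.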
In [28]: pd.DataFrame({'A':np.repeat(df.A.values, 3), 'B':np.tile(df.A.values,3)})
Out[28]:
   A  B
0  1  1
1  1  2
2  1  3
3  2  1
4  2  2
5  2  3
6  3  1
7  3  2
8  3  3

Here is a NumPy approach using np.repeat:

In [282]: df.A
Out[282]: 
1    4
2    9
3    5
Name: A, dtype: int64

In [288]: r = np.repeat(df.A.values[:,None],3,axis=1)

In [289]: pd.DataFrame(np.c_[r.ravel(), r.T.ravel()], columns=['A','B'])
Out[289]: 
   A  B
0  4  4
1  4  9
2  4  5
3  9  4
4  9  9
5  9  5
6  5  4
7  5  9
8  5  5
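
The answers above read "B" as the values of "A" tiled. The example in the question is also consistent with a second reading, where "B" is simply a counter from 1 to n for each value of "A". Under that assumption, a possible sketch looks like this; `expand_rows` is a hypothetical helper name, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def expand_rows(df, n):
    """Repeat each value of column 'A' n times and add a 1..n counter as 'B'.

    Hypothetical helper; assumes 'B' is meant to be a counter, not a copy of 'A'.
    """
    a = df['A'].to_numpy()
    return pd.DataFrame({
        'A': np.repeat(a, n),                       # 4 4 4 9 9 9 5 5 5
        'B': np.tile(np.arange(1, n + 1), len(a)),  # 1 2 3 1 2 3 1 2 3
    })


df = pd.DataFrame({'A': [4, 9, 5]}, index=[1, 2, 3])
out = expand_rows(df, 3)
print(out)
```

With "A" equal to 1, 2, 3 and n = 3, both readings give the same table, which is why the question's example does not settle which one was intended.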

Source: https://habr.com/ru/post/1681639/

