np.random.shuffle gives the same result every time

I am trying to take a pandas DataFrame, pull out one column, shuffle the contents of that column, put it back in the DataFrame, and return the result. This is the code I use:

def randomize(self, data, column):
    '''Takes in a pandas DataFrame and randomizes the values in column.

    data is the pandas dataframe to be altered.
    column is the column in the dataframe to be randomized.

    returns the altered dataframe.
    '''
    df1 = data
    df1.drop(column, 1)
    newcol = list(data[column])
    np.random.shuffle(newcol)
    df1[column] = newcol
    return df1

It gives the same result every time I run it. Why is this?

Note: I use the same data file every time.

1 answer

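Your code, lightly cleaned up (no self, no drop, and data.copy() so the input DataFrame is not modified in place)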

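import numpy as np
import pandas as pd

def randomize(data, column):
    df1 = data.copy()              # work on a copy, not the caller's frame
    newcol = list(data[column])
    np.random.shuffle(newcol)      # shuffles the list in place
    df1[column] = newcol
    return df1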
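
As a side note, the same column shuffle can be written without the intermediate list. This is only a minimal sketch, not part of your original code: the name randomize_column and the seed parameter are made up for illustration, and it assumes NumPy 1.17 or newer for np.random.default_rng.

```python
import numpy as np
import pandas as pd


def randomize_column(data, column, seed=None):
    """Return a copy of data with the values in column shuffled.

    randomize_column and seed are hypothetical names used for this sketch.
    """
    rng = np.random.default_rng(seed)  # seed=None draws fresh entropy from the OS
    df1 = data.copy()                  # leave the caller's DataFrame untouched
    # permutation returns a shuffled copy of the column's values
    df1[column] = rng.permutation(df1[column].to_numpy())
    return df1


df = pd.DataFrame(np.arange(25).reshape(5, 5), list('abcde'), list('ABCDE'))
shuffled = randomize_column(df, 'A')
```

Passing an integer as seed makes the result repeatable, which is handy in tests. Leaving it as None should give a different order from run to run.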

My df

df = pd.DataFrame(np.arange(25).reshape(5, 5), list('abcde'), list('ABCDE'))

Your code + My df

np.random.seed([3,1415])
randomize(df, 'A')

And again

randomize(df, 'A')

The two calls return column A in different orders, so it looks like it works!
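
One thing worth checking if your output really is identical on every run: a fixed seed set before the shuffle makes the result reproducible. The sketch below illustrates that behaviour. Whether your script sets a seed somewhere is an assumption on my part, since the question does not show it.

```python
import numpy as np

values = list(range(10))

# Re-seeding with the same value before each shuffle replays the same order.
np.random.seed(0)
first = values.copy()
np.random.shuffle(first)

np.random.seed(0)
second = values.copy()
np.random.shuffle(second)

# Without re-seeding, the generator simply continues its stream.
third = values.copy()
np.random.shuffle(third)

print(first == second)  # True
```

If a call like np.random.seed(...) with a constant argument runs at the start of your program, every run will replay the same shuffle.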


Source: https://habr.com/ru/post/1650729/