Splitting a data column into equal windows in Pandas

I have a dataframe like the following. I want to split it into windows of 30 rows, then loop over the windows and call other functions on each one.

import numpy as np
import pandas as pd

index = pd.date_range(start='2016-01-01', end='2016-04-01', freq='D')
data = pd.DataFrame(np.random.rand(len(index)), index=index, columns=['random'])

I found the following function, but I am wondering if there is a more efficient way to do this.

def split(df, chunkSize=30):
    listOfDf = list()
    # Ceiling division avoids an empty trailing chunk when len(df) is a multiple of chunkSize.
    numberChunks = -(-len(df) // chunkSize)
    for i in range(numberChunks):
        listOfDf.append(df[i*chunkSize:(i+1)*chunkSize])
    return listOfDf
2 answers

You can use a list comprehension. See this SO post on how to access the resulting dataframes and for another way to split a dataframe.

n = 200000  # chunk row size
list_df = [df[i:i+n] for i in range(0, df.shape[0], n)]
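
Applied to the data from the question, the same idea might look like the sketch below. It assumes a window size of 30, and summarize is a hypothetical stand-in for the "other functions" the question mentions; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same data as in the question: 92 daily rows (2016 is a leap year).
index = pd.date_range(start='2016-01-01', end='2016-04-01', freq='D')
data = pd.DataFrame(np.random.rand(len(index)), index=index, columns=['random'])

n = 30  # window size from the question

# Slice by row position; the last window holds whatever rows are left over.
windows = [data.iloc[i:i + n] for i in range(0, len(data), n)]


def summarize(window):
    # Hypothetical placeholder for the per-window processing.
    return window['random'].mean()


window_sizes = [len(w) for w in windows]  # [30, 30, 30, 2]
window_means = [summarize(w) for w in windows]
```

Every window except possibly the last has exactly n rows, and no empty window is produced when len(data) is a multiple of n.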

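You can do this efficiently with NumPy's array_split. Note that it splits the dataframe into a given number of nearly equal pieces, so the pieces are not guaranteed to have exactly chunkSize rows. For example: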

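import numpy as np

def split(df, chunkSize=30):
    # Ceiling division, so an exact multiple of chunkSize does not add an extra chunk;
    # max(1, ...) keeps an empty dataframe from requesting zero sections.
    numberChunks = max(1, -(-len(df) // chunkSize))
    # array_split returns numberChunks pieces of nearly equal length.
    return np.array_split(df, numberChunks, axis=0)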
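
To see how array_split distributes rows, here is a small sketch on the question's data. As an assumption made for illustration only, it splits the row positions and selects them with iloc instead of passing the dataframe to NumPy directly; the names chunk_size, number_chunks and position_groups are hypothetical.

```python
import numpy as np
import pandas as pd

index = pd.date_range(start='2016-01-01', end='2016-04-01', freq='D')
data = pd.DataFrame(np.random.rand(len(index)), index=index, columns=['random'])

chunk_size = 30
number_chunks = -(-len(data) // chunk_size)  # ceiling division: 92 rows -> 4 chunks

# Split the row positions into nearly equal groups, then select each group.
position_groups = np.array_split(np.arange(len(data)), number_chunks)
chunks = [data.iloc[pos] for pos in position_groups]

chunk_sizes = [len(c) for c in chunks]  # [23, 23, 23, 23], not [30, 30, 30, 2]
```

If the windows must have exactly 30 rows, with a shorter remainder at the end, plain slicing in steps of 30 is the closer fit; array_split is more suitable when balanced chunk sizes matter.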



Source: https://habr.com/ru/post/1682219/

