It would be nice if you could fill NaN with, say, 0 while reading. Perhaps a feature request on the pandas GitHub is in order...
Using the converter function
For now, however, you can define your own function for this and pass it to the converters argument of read_csv:
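    import pandas as pd

    def conv(val):
        # A converter receives the raw field as a string, so a missing value
        # arrives as '' (or a marker such as 'NaN'), never as np.nan.
        if val.strip() in ('', 'NaN', 'nan'):
            return 0
        return val  # still a string; cast here (e.g. float(val)) if needed

    df = pd.read_csv(file, converters={colWithNaN: conv}, dtype=...)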
Note that converters accepts a dict, so you need to add an entry for each column that contains NaN. This can become a little tedious if many columns are affected. You can use either column names or column numbers as keys.
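When many columns are affected, the dict can be built from a list of column names instead of being written out by hand. The following is only a minimal sketch: the helper name nan_to_zero, the sample data, and the column names a, b, c are made up for illustration, and it assumes the affected columns are numeric.

```python
import io

import pandas as pd


def nan_to_zero(val):
    """Hypothetical converter: empty or NaN-like field -> 0.0, otherwise float."""
    text = val.strip()
    if text in ("", "NaN", "nan"):
        return 0.0
    return float(text)


# Made-up sample data standing in for the real file.
csv_text = "a,b,c\n1,,x\n,2,y\n3,NaN,z\n"

# Keys as column names: one entry per affected column.
cols_with_nan = ["a", "b"]
by_name = pd.read_csv(
    io.StringIO(csv_text),
    converters={col: nan_to_zero for col in cols_with_nan},
)

# Keys as column numbers: only column 0 ("a") is converted here,
# so column "b" still contains NaN.
by_position = pd.read_csv(io.StringIO(csv_text), converters={0: nan_to_zero})

print(by_name)
print(by_position)
```

Columns without an entry in the dict are parsed as usual, which is why b keeps its NaN values in the second call.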
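Note also that this may slow down read_csv, depending on how converters is handled. And if only one column needs its NaN values handled, you can skip the function definition and use a lambda: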
    # The converter sees the raw string, so test for an empty or NaN-like field.
    df = pd.read_csv(file, converters={colWithNaN: lambda x: 0 if x.strip() in ('', 'NaN', 'nan') else x}, dtype=...)
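One detail worth checking in any converter: NaN never compares equal to anything, itself included, so an equality test against NaN cannot detect it. The sketch below shows this with plain Python floats and one possible way to test a raw field instead; is_missing is a hypothetical helper, not part of pandas.

```python
import math

nan = float("nan")

# NaN is not equal to itself, so `value == nan` is always False.
nan_equals_itself = (nan == nan)
print(nan_equals_itself)


def is_missing(raw):
    """Hypothetical helper: True for an empty field or any spelling of NaN."""
    text = raw.strip()
    if text == "":
        return True
    try:
        return math.isnan(float(text))
    except ValueError:
        # Not a number at all (e.g. a text column), so not a missing value.
        return False


print(is_missing(""), is_missing(" NaN "), is_missing("3"), is_missing("abc"))
```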
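Reading in chunks
Alternatively, you can read the file in small chunks, clean each chunk, and stitch the pieces together. An illustrative example: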
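    import numpy as np
    import pandas as pd

    result = None
    reader = pd.read_csv(file, chunksize=1000)
    for chunk in reader:
        chunk.dropna(axis=0, inplace=True)  # drop every row that contains a NaN
        chunk[colToConvert] = chunk[colToConvert].astype(np.uint32)
        # DataFrame.append was removed in pandas 2.0; pd.concat replaces it.
        result = chunk if result is None else pd.concat([result, chunk])
    del reader, chunk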
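Note that this approach does not strictly duplicate the data. Right after a chunk is appended to result, the rows in chunk exist twice, but only chunksize rows are duplicated at any one time.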
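The same chunked loop can also fill NaN with 0 rather than dropping rows, which is closer to the original goal. This is a small self-contained sketch: the sample data and the column names id and count are invented for the example, and the pieces are collected in a list and concatenated once at the end.

```python
import io

import numpy as np
import pandas as pd

# Made-up sample data standing in for the real file.
csv_text = "id,count\n1,10\n2,\n3,30\n4,40\n5,\n"

pieces = []
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    # Replace missing values in "count" with 0, then shrink the dtype.
    chunk = chunk.fillna({"count": 0})
    chunk["count"] = chunk["count"].astype(np.uint32)
    pieces.append(chunk)

# One concat at the end instead of growing the frame inside the loop.
result = pd.concat(pieces, ignore_index=True)
print(result)
print(result.dtypes)
```

Concatenating once at the end avoids recopying the accumulated rows on every iteration, at the cost of briefly holding both the list of pieces and the final frame in memory.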