I am curious why the simple concatenation of two data frames in pandas:
    shape: (66441, 1)
    dtypes:
    prediction    int64
    dtype: object
    isnull().sum():
    prediction    0
    dtype: int64

    shape: (66441, 1)
    dtypes:
    CUSTOMER_ID    int64
    dtype: object
    isnull().sum():
    CUSTOMER_ID    0
    dtype: int64
both of the same shape and both without NaN values
    foo = pd.concat([initId, ypred], join='outer', axis=1)
    print(foo.shape)
    print(foo.isnull().sum())
can result in so many NaN values once they are combined:
    (83384, 2)
    CUSTOMER_ID    16943
    prediction     16943
How can I fix this and prevent NaN values from being introduced?
Trying to reproduce it with
    import pandas as pd

    aaa = pd.DataFrame([0,1,0,1,0,0], columns=['prediction'])
    print(aaa)
    bbb = pd.DataFrame([0,0,1,0,1,1], columns=['groundTruth'])
    print(bbb)
    pd.concat([aaa, bbb], axis=1)
did not reproduce the problem. It worked just fine, and no NaN values were introduced.
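For reference, here is a minimal sketch of one likely cause. It assumes the two real frames carry different index labels, which the question does not show. `pd.concat` with `axis=1` aligns rows on index labels rather than on position, so an outer join returns the union of both indexes. The names `init_id` and `y_pred` and their values are hypothetical stand-ins for `initId` and `ypred`. The toy example above would not trigger this, because both of its frames get the same default index.

```python
import pandas as pd

# Hypothetical stand-ins for initId and ypred: same length, but the index
# labels only partly overlap (an assumption about the real data).
init_id = pd.DataFrame({"CUSTOMER_ID": [101, 102, 103, 104]}, index=[0, 1, 2, 3])
y_pred = pd.DataFrame({"prediction": [0, 1, 0, 1]}, index=[2, 3, 4, 5])

# axis=1 with join='outer' aligns on index labels, so the result has one row
# per label in the union of both indexes, with NaN where a label is missing.
misaligned = pd.concat([init_id, y_pred], join="outer", axis=1)
print(misaligned.shape)
print(misaligned.isnull().sum())

# Dropping the old indexes pairs the rows by position instead.
aligned = pd.concat(
    [init_id.reset_index(drop=True), y_pred.reset_index(drop=True)],
    axis=1,
)
print(aligned.shape)
print(aligned.isnull().sum())
```

Pairing by position is only meaningful if both frames list their rows in the same order, so comparing `initId.index` with `ypred.index` first would show whether this is what happens here.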