I load a two-dimensional dataset into memory using pandas and perform four simple machine learning preprocessing tasks, such as adding or removing columns, reindexing, and splitting the data into train and test sets.
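import pandas as pd
from sklearn.model_selection import train_test_split

# Read the comma-separated file; sep is passed by keyword
MLMe = pd.read_table("data/dtCTG.txt", sep=",")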
MLMe.rename(columns={'NSP' : 'class'}, inplace=True)
MLMe_class = MLMe['class'].values
# Stratified 75/25 split of the row index
training_indices, validation_indices = train_test_split(
    MLMe.index, stratify=MLMe_class, train_size=0.75, test_size=0.25)
X_train = MLMe.drop('class',axis=1).loc[training_indices].values
y_train = MLMe.loc[training_indices,'class'].values
X_test = MLMe.drop('class',axis=1).loc[validation_indices].values
y_test = MLMe.loc[validation_indices, 'class'].values
X_train, y_train, X_test, y_test
Now, when I pass the X_train and y_train arrays to some libraries, I get an error saying that the buffer is not C-contiguous:
BufferError: memoryview: underlying buffer is not C-contiguous
My question is: how can I make X_train and y_train C-contiguous buffers? I tried reshaping with the C and F order options, but no luck.
EDIT: here are the shapes, dtypes, and flags of the arrays:
X_train.shape, y_train.shape, X_test.shape, y_test.shape
((1104, 9), (1104,), (369, 9), (369,))
X_train.dtype, y_train.dtype, X_test.dtype, y_test.dtype
(dtype('int64'), dtype('int64'), dtype('int64'), dtype('int64'))
X_train.flags, y_train.flags, X_test.flags, y_test.flags
( C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False,
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False,
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False,
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
)
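The flag pattern above can be reproduced without the dataset. Below is a minimal sketch that assumes only NumPy; the arrays `X` and `y` are made-up stand-ins with the reported shapes and dtype, not the CTG data. It builds a column-major (Fortran-ordered) 2-D array, which shows the same C_CONTIGUOUS : False / F_CONTIGUOUS : True combination as X_train, and then makes a row-major copy with `np.ascontiguousarray`, which is one way the layout could be changed.

```python
import numpy as np

# Made-up stand-in for X_train: same shape and dtype as reported above,
# stored column-major (Fortran order) so it is F-contiguous only.
X = np.asfortranarray(np.arange(1104 * 9, dtype=np.int64).reshape(1104, 9))
print(X.flags['C_CONTIGUOUS'], X.flags['F_CONTIGUOUS'])

# np.ascontiguousarray returns a row-major (C order) array, copying the
# data when the input is not already C-contiguous.
X_c = np.ascontiguousarray(X)
print(X_c.flags['C_CONTIGUOUS'], X_c.flags['F_CONTIGUOUS'])

# The values are unchanged; only the memory layout differs.
same_values = np.array_equal(X, X_c)

# Made-up stand-in for y_train: a 1-D array is both C- and F-contiguous,
# which matches the flags shown for y_train and y_test.
y = np.arange(1104, dtype=np.int64)
print(y.flags['C_CONTIGUOUS'], y.flags['F_CONTIGUOUS'])
```

If this carries over to the real data, only the 2-D arrays (X_train, X_test) would need the conversion. Reshaping an array to its own shape may simply return a view with the original strides, which could be why the order options appeared to have no effect.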