I am trying to compute a correlation matrix from a pandas DataFrame, using data from specific columns.
Here is my CSV data:
col0,col1,col2,col3,col4
122468.9071,1417464.203,3546600,151804924,10839476
14691.1139,170036.0407,103847,19208604,2365065
Here are the two DataFrames I created:
df1 = pd.read_csv('c:/temp/test_1.csv', usecols=[0])
df2 = pd.read_csv('c:/temp/test_1.csv', usecols=[1])
I tried both the corr and corrwith functions, but neither gave a usable result:
Corr Function:
print df1.corr(df2)
Result:
Error: Could not compare ['pearson'] with block values
Corrwith:
print df1.corrwith(df2)
Result:
col0 NaN
col1 NaN
dtype: float64
As you can see, there are no null values in the dataset, and float64 should be able to handle decimals.
Any help with a solution would be greatly appreciated.
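For reference, here is a minimal sketch of what may be going on, not a definitive answer. The first positional argument of DataFrame.corr is the method name, so passing df2 there does not compare the two frames. corrwith aligns the two frames on column labels, and since df1 only has col0 and df2 only has col1, there is nothing to pair up, which would explain the NaN values. The sketch assumes Python 3 and inlines the two sample rows with io.StringIO instead of reading c:/temp/test_1.csv; the variable names csv_text, pair_corr and corr_matrix are only illustrative.

```python
import io

import pandas as pd

# The two sample rows from the question, inlined so the sketch runs without the file.
csv_text = """col0,col1,col2,col3,col4
122468.9071,1417464.203,3546600,151804924,10839476
14691.1139,170036.0407,103847,19208604,2365065
"""

# Load both columns of interest into a single DataFrame.
df = pd.read_csv(io.StringIO(csv_text), usecols=["col0", "col1"])

# Series.corr takes another Series, so select the columns by label
# instead of passing a whole DataFrame.
pair_corr = df["col0"].corr(df["col1"])

# DataFrame.corr() called with no arguments returns the pairwise matrix
# for every numeric column in the frame.
corr_matrix = df.corr()

print(pair_corr)
print(corr_matrix)
```

With only two rows the coefficient is not informative, so this only shows the call pattern. Keeping the columns in one DataFrame avoids the label-alignment problem entirely.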
Tiberius