I wrote a small piece of code for linear regression using sklearn.
I created a CSV file with two columns (named X and Y, containing some numbers), and when I read the file the contents are read correctly, as shown below.
However, when I try to reference a column using commands such as datafile[:,:] or datafile[:,-1], I get the error "TypeError: unhashable type".
And when I try to use X as the predictor and Y as the response in sklearn's LinearRegression, I get a ValueError, as shown below.
I looked online but could not work out what is wrong with my code or file. Please help.
import pandas as pd
datafile=pd.read_csv('samplelinear.csv')
datafile
X Y
0 0 1.440000
1 1 33.220000
. . .
print datafile.__class__
<class 'pandas.core.frame.DataFrame'>
datafile[:,:]
TypeError: unhashable type
datafile[:,:1]
TypeError: unhashable type
from sklearn.linear_model import LinearRegression
model=LinearRegression()
model.fit(datafile.X,datafile.Y)
ValueError: Found arrays with inconsistent numbers of samples: [ 1 14]
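
For comparison, here is a minimal sketch of the two operations that may behave as intended. It rests on some assumptions: plain `[]` on a DataFrame looks up column labels, so positional slicing probably needs `.iloc`; and `fit` expects a 2-D feature matrix of shape (n_samples, n_features), so a 1-D column such as `datafile.X` may be what triggers the sample-count error. The data below is made up (a hypothetical stand-in for samplelinear.csv with 14 rows), and the names `features` and `target` are for illustration only.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for samplelinear.csv: 14 rows, columns X and Y.
# The values are invented so the example is self-contained.
datafile = pd.DataFrame({
    "X": range(14),
    "Y": [2.0 * x + 1.0 for x in range(14)],
})

# Positional (row, column) slicing goes through .iloc rather than plain [].
all_cells = datafile.iloc[:, :]
first_col = datafile.iloc[:, :1]   # still a DataFrame, shape (14, 1)

# Double brackets keep the feature column 2-D: shape (14, 1).
# datafile.X or datafile["X"] would be a 1-D Series of shape (14,).
features = datafile[["X"]]
target = datafile["Y"]

model = LinearRegression()
model.fit(features, target)

print(first_col.shape)
print(model.coef_, model.intercept_)
```

If the feature is already a 1-D NumPy array, `array.reshape(-1, 1)` should give the same (n_samples, 1) shape.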