I am new to Python. I am trying to use sklearn.cluster. Here is my code:
from sklearn.cluster import MiniBatchKMeans
kmeans = MiniBatchKMeans(n_clusters=2)
kmeans.fit(df)
But I get the following error:
50 and not np.isfinite(X).all()):
51 raise ValueError("Input contains NaN, infinity"
---> 52 " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64')
I checked that there are no NaN or infinite values, so only one option remains: a value too large for the dtype. However, the output of df.info() shows that every column is float64, so I don't understand where the problem comes from.
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 362358 entries, 135 to 4747145
Data columns (total 8 columns):
User 362358 non-null float64
Hour 362352 non-null float64
Minute 362352 non-null float64
Day 362352 non-null float64
Month 362352 non-null float64
Year 362352 non-null float64
Latitude 362352 non-null float64
Longitude 362352 non-null float64
dtypes: float64(8)
memory usage: 24.9 MB
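For reference, here is a minimal sketch of how a NaN/infinity check could be done explicitly. It runs on a small made-up frame (`sample`, with illustrative column values), not on the real `df`, so the numbers are assumptions for demonstration only. Note that the `df.info()` output above reports 362358 entries but only 362352 non-null values in seven of the columns, which suggests that a few rows may contain missing values after all.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real df
sample = pd.DataFrame({
    "User": [1.0, 2.0, 3.0, 4.0],
    "Hour": [10.0, np.nan, 12.0, 13.0],
    "Latitude": [48.1, 48.2, np.inf, 48.4],
})

# Count NaN values per column (a pandas Series indexed by column name)
nan_counts = sample.isna().sum()

# Count +/- infinity per column (a NumPy array in column order)
inf_counts = np.isinf(sample.to_numpy()).sum(axis=0)

# Boolean mask: True for rows in which every value is finite
finite_rows = np.isfinite(sample.to_numpy()).all(axis=1)

# Keep only the fully finite rows; these could then be passed to fit()
clean = sample[finite_rows]

print(nan_counts)
print(inf_counts)
print(len(sample), "rows before,", len(clean), "rows after filtering")
```

If the counts are non-zero on the real data, dropping or imputing those rows before calling `kmeans.fit` would be one option; which one is appropriate depends on the data.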
Many thanks,
Mitch