We are reading data from a MongoDB collection. A single column in the collection can hold two different data types (for example: (bson.Int64, int) or (int, float)).
I am trying to get a data type using pyspark.
My problem is that some columns have different data types.
Suppose quantity and weight are columns:
quantity         weight
---------------  --------
12300            656
123566000000     789.6767
1238             56.22
345              23
345566677777789  21
In fact, we did not define a data type for any column of the Mongo collection.
When I invoke count() on the pyspark dataframe
dataframe.count()
I get an exception like this:
"Cannot cast STRING into a DoubleType (value: BsonString{value='200.0'})"