I need to replace some values in a DataFrame column (zeros with the column's mode; I know this approach is not very accurate, but I am just practicing). I am more familiar with the Python Apache Spark documentation, where the examples are usually more explanatory. So I decided to look there first, rather than in the Scala documentation, and I noticed that I can achieve what I need with the replace method of DataFrame.
In this example, I replace every 2 with 20 in the column col.
df = df.replace("2", "20", subset="col")
After gaining some confidence with the Python API, I decided to replicate this in Scala, and I noticed some strange things in the Scala documentation. First, DataFrame apparently does not have a replace method. Second, after some research, I found that I should use the replace function of DataFrameNaFunctions. But here is the strange part: if you look at the details of this method, you will notice that the documentation uses it in the same way as in the Python implementation (see the figure below).

After that, I tried to run this in Scala and it failed with the following error:
Name: Compile Error
Message: <console>:108: error: value replace is not a member of org.apache.spark.sql.DataFrame
val dx = df.replace(column, Map(0.0 -> doubleValue))
^
StackTrace:
Then I tried to apply replace through DataFrameNaFunctions, but I can't get it to work as easily as in Python: I get an error and I don't understand why.
val dx = df.na.replace(column, Map(0.0 -> doubleValue))
This is the error I get:
Name: Compile Error
Message: <console>:108: error: overloaded method value replace with alternatives:
[T](cols: Seq[String], replacement: scala.collection.immutable.Map[T,T])org.apache.spark.sql.DataFrame <and>
[T](col: String, replacement: scala.collection.immutable.Map[T,T])org.apache.spark.sql.DataFrame <and>
[T](cols: Array[String], replacement: java.util.Map[T,T])org.apache.spark.sql.DataFrame <and>
[T](col: String, replacement: java.util.Map[T,T])org.apache.spark.sql.DataFrame
cannot be applied to (String, scala.collection.mutable.Map[Double,Double])
val dx = df.na.replace(column, Map(0.0 -> doubleValue))
^
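My only guess so far comes from the last line of the message: in my session `Map` seems to resolve to scala.collection.mutable.Map, while every Scala overload listed expects a scala.collection.immutable.Map. Below is a minimal sketch of that mismatch without Spark. The `replace` function here is a hypothetical stand-in that only copies the parameter types of one overload from the error (it is not Spark's implementation), and the local import and the sample value of `doubleValue` are assumptions used to reproduce the situation.

```scala
import scala.collection.immutable

object ReplaceOverloadSketch {
  // Hypothetical stand-in with the same parameter types as the
  // (col: String, replacement: immutable.Map[T, T]) overload in the error.
  // It only formats its arguments; it does not touch any DataFrame.
  def replace[T](col: String, replacement: immutable.Map[T, T]): String =
    s"replace($col, $replacement)"

  def main(args: Array[String]): Unit = {
    val doubleValue = 1.5 // assumed sample value

    // Assumption: something in the session re-bound `Map` to the mutable one.
    import scala.collection.mutable.Map
    val m = Map(0.0 -> doubleValue) // this is a mutable.Map[Double, Double]

    // replace("col", m) // would not compile: mutable.Map is not an immutable.Map

    // Converting to an immutable map satisfies the parameter type.
    println(replace("col", m.toMap))

    // Naming the immutable Map explicitly also satisfies it.
    println(replace("col", immutable.Map(0.0 -> doubleValue)))
  }
}
```

If this guess is right, the same idea would apply to my call: pass `scala.collection.immutable.Map(0.0 -> doubleValue)` to `df.na.replace`, or find and remove whatever imported the mutable Map.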