How to update Row / column value in Apache Spark DataFrame?

Hi, I have an ordered Spark DataFrame and would like to change a few rows while iterating over it with the following code, but there seems to be no way to update the Row object:

orderedDataFrame.foreach(new Function1<Row, BoxedUnit>() {

    @Override
    public BoxedUnit apply(Row v1) {
        // How do I change the Row here?
        // I want to change column no. 2, which I read using v1.get(2).
        // Also, what is BoxedUnit and how do I use it?
        return null;
    }
});

The code above also fails to compile, with the error "myclassname is not abstract and does not override abstract method apply$mcVj$sp(long) in scala.Function1". Please advise. I am new to Spark and am using version 1.4.0.

1 answer

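Row objects are immutable, so you cannot change one in place inside foreach. Instead, map each row to a new Row and build a new DataFrame with the same schema. Try the following: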

final DataFrame withoutCurrency = sqlContext.createDataFrame(
        somedf.javaRDD().map(row -> {
            return RowFactory.create(row.get(0), row.get(1), someMethod(row.get(2)));
        }), somedf.schema());
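To make the idea concrete without a Spark cluster, here is a minimal plain-Java sketch of the same pattern: each immutable row is mapped to a new row in which one column has been transformed. Nothing in it is Spark API. The class RowUpdateSketch, its helper methods, and the "strip the currency suffix" transformation are hypothetical stand-ins for Row, RowFactory.create and someMethod, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical, Spark-free model of "rebuild the row instead of mutating it".
public class RowUpdateSketch {

    // Stand-in for an immutable Spark Row: an unmodifiable list of column values.
    static List<Object> row(Object... values) {
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(values)));
    }

    // Returns a NEW row whose column at `index` is fn(oldValue); the source row is untouched.
    static List<Object> withColumnUpdated(List<Object> source, int index,
                                          Function<Object, Object> fn) {
        List<Object> copy = new ArrayList<>(source);
        copy.set(index, fn.apply(source.get(index)));
        return Collections.unmodifiableList(copy);
    }

    // Plays the role of javaRDD().map(...): every row is replaced by a rebuilt row.
    static List<List<Object>> updateAll(List<List<Object>> rows, int index,
                                        Function<Object, Object> fn) {
        return rows.stream()
                .map(r -> withColumnUpdated(r, index, fn))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Object>> rows = Arrays.asList(
                row(1, "a", "10 USD"),
                row(2, "b", "20 EUR"));

        // Assumed transformation: keep only the amount, drop the currency suffix.
        List<List<Object>> updated = updateAll(rows, 2, v -> ((String) v).split(" ")[0]);

        System.out.println(updated);
    }
}
```

In the Spark version above, somedf.javaRDD().map(...) does the job of updateAll, and passing somedf.schema() to createDataFrame only works as long as the rebuilt rows keep the same column count and types as the original.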

Source: https://habr.com/ru/post/1598094/