How to handle exceptions in a Spark map() function?

I want to ignore exceptions thrown inside map(), for example:

rdd.map(_.toInt) 

where rdd is an RDD[String].

If it encounters a non-numeric string, the conversion throws an exception and the job fails.

What is an easy way to ignore any exception and skip the offending line? (I do not want to use a filter to handle this, because there can be many other kinds of exceptions ...)

2 answers

You can use a combination of Try and map/filter.

Try wraps the computation in a Success if it behaves as expected, or in a Failure if an exception is thrown. You can then filter for what you want - in this case the successful computations, but you could also keep the failure cases, for example for logging.

The following code is a possible starting point. You can run and experiment with it at scastie.org to see whether it suits your needs.

    import scala.util.Try

    object Main extends App {
      val in = List("1", "2", "3", "abc")
      val out1 = in.map(a => Try(a.toInt))
      val results = out1.filter(_.isSuccess).map(_.get)
      println(results)
    }
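A slightly more compact variant of the same idea, and one that translates directly to an RDD, is to convert the Try to an Option and use flatMap, so failed parses are simply dropped (a sketch on a plain List; on Spark you would call the same flatMap on your RDD[String]):

```scala
import scala.util.Try

// Try(...).toOption yields Some(value) on Success and None on Failure,
// so flatMap keeps only the strings that parsed cleanly.
val in = List("1", "2", "3", "abc")
val results = in.flatMap(s => Try(s.toInt).toOption)
// results == List(1, 2, 3)
```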

I recommend using filter/map

    import org.apache.commons.lang3.math.NumberUtils

    rdd.filter(r => NumberUtils.isNumber(r)).map(r => r.toInt)

or flatMap

    exampleRDD.flatMap(r => if (NumberUtils.isNumber(r)) Some(r.toInt) else None)

Alternatively, you can catch the exception inside the map function

    myRDD.map { r =>
      try {
        r.toInt
      } catch {
        case _: RuntimeException => -1
      }
    }

and then filter out the -1 sentinel values afterwards.
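On plain Scala collections the same sentinel pattern looks like this (a sketch; note that -1 only works as a sentinel if -1 can never occur as a legitimate value in your data):

```scala
val in = List("1", "2", "abc", "3")

// Map bad inputs to a sentinel, then drop the sentinel in a second pass.
val parsed = in.map { s =>
  try s.toInt
  catch { case _: NumberFormatException => -1 } // sentinel for bad input
}
val results = parsed.filter(_ != -1)
// results == List(1, 2, 3)
```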


Source: https://habr.com/ru/post/986478/

