Avoid "Task is not serializable" with nested method in class

I understand the common "Task is not serializable" problem, which occurs when a closure accesses a field or method that lives outside the closure's own scope.

To fix this, I usually define a local copy of these fields / methods, which avoids serializing the whole class:

    class MyClass(val myField: Any) {
      def run() = {
        val f = sc.textFile("hdfs://xxx.xxx.xxx.xxx/file.csv")

        // Local copy, so the closure captures only this value and not `this`
        val myField = this.myField
        println(f.map( _ + myField ).count)
      }
    }

Now, if I define a nested function in the run method, it cannot be serialized:

    class MyClass(val myField: Any) {
      def run() = {
        val f = sc.textFile("hdfs://xxx.xxx.xxx.xxx/file.csv")

        // Nested method: passing it to map fails with "Task is not serializable"
        def mapFn(line: String) = line.split(";")

        val myField = this.myField
        println(f.map( mapFn( _ ) ).count)
      }
    }

I do not understand this, since I thought that "mapFn" would be in scope. Even stranger, if I define mapFn as a val instead of a def, it works:

    class MyClass() {
      def run() = {
        val f = sc.textFile("hdfs://xxx.xxx.xxx.xxx/file.csv")

        // Function value instead of a nested method
        val mapFn = (line: String) => line.split(";")

        println(f.map( mapFn( _ ) ).count)
      }
    }

Is this related to the way Scala represents nested functions?

What is the recommended way to deal with this problem? Avoid nested functions?

1 answer

Doesn't it work like this: in the first case, f.map(mapFn(_)) is equivalent to f.map(new Function() { override def apply(...) = mapFn(...) }), while in the second it is just f.map(mapFn)? When you declare a method with def, it is probably just a method in some anonymous class with an implicit $outer reference to the enclosing class. But map requires a Function, so the compiler has to wrap the method. In the wrapper you only refer to a method of that anonymous class, not to the instance itself. If you use val, you have a direct reference to the function, which you pass to map. I'm not sure about this, just thinking out loud ...
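One way to check this guess without a cluster is to push both kinds of function through plain Java serialization, which, as far as I know, is what Spark uses for closures. This is only a sketch under assumptions: ClosureCheck, Splitter, viaMethod and viaLocalCopy are made-up names, and an ordinary instance method stands in for the nested def, because how a nested def is compiled may differ between Scala versions.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical helper, not part of Spark.
object ClosureCheck {
  // True if plain Java serialization accepts the object.
  def isSerializable(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(obj); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }
}

// Deliberately not Serializable, like MyClass in the question.
class Splitter(val sep: String) {
  def split(line: String): Array[String] = line.split(sep)

  // Calls an instance method, so the function object has to hold `this`.
  def viaMethod: String => Array[String] = line => split(line)

  // Copies the field into a local val first; only the String is captured.
  def viaLocalCopy: String => Array[String] = {
    val localSep = sep
    line => line.split(localSep)
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val s = new Splitter(";")
    println(ClosureCheck.isSerializable(s.viaMethod))    // expected: false
    println(ClosureCheck.isSerializable(s.viaLocalCopy)) // expected: true
  }
}
```

If the first check fails and the second passes, that would fit the idea that the wrapper drags the enclosing instance along, while a function value built from local vals does not.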


Source: https://habr.com/ru/post/969502/

