org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:

I am trying to register a simple UDF that extracts a date part in Spark SQL, using Scala in the Eclipse Luna IDE. This is my code:
sqlContext.udf.register("extract", (dateUnit: String, date: String) => udf.extract(dateUnit, date))

def extract(dateUnit: String, date: String): String = {
  // split an ISO-style date such as "2015-05-01" into [year, month, day]
  val splitArray: Array[String] = date.split("-")
  dateUnit.toUpperCase() match {
    case "YEAR"  => splitArray(0)
    case "MONTH" => splitArray(1)
    case "DAY"   => splitArray(2)
    case whoa    => "Unexpected case :" + whoa
  }
}
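
For context, udf.extract(...) inside the registered lambda is a call on some enclosing object, so Spark must serialize that object together with the closure when it ships the UDF to the executors. A minimal sketch of what that context might look like (the object name udf is an assumption; its definition is not shown above):

    // Hypothetical enclosing object. If it, or an outer class holding it,
    // is not serializable, shipping the closure to the executors fails
    // with "Task not serializable".
    object udf extends Serializable {
      def extract(dateUnit: String, date: String): String = { /* as above */ }
    }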

When I execute this function through a query like

    Select * from date_dim WHERE d_dom < extract('YEAR', '2015-05-01') limit 10

the Eclipse console throws an error like:

org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
Aggregate false, [], [Coalesce(SUM(PartialCount#30L),0) AS count#28L]
 Aggregate true, [], [COUNT(1) AS PartialCount#30L]
  Project []

Caused by: org.apache.spark.SparkException: Task not serializable
Caused by: java.lang.reflect.InvocationTargetException
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
    at java.io.ObjectStreamClass$FieldReflector.getObjFieldValues(ObjectStreamClass.java:2030)
    at java.io.ObjectStreamClass.getObjFieldValues(ObjectStreamClass.java:1232)

I can't figure out exactly what the problem is. Simple UDFs whose logic is defined inline, such as sqlContext.udf.register("strLength", (str: String) => str.length()), work fine, and the same extract function above runs successfully in the Spark Scala shell. What is the problem? Am I doing something wrong?
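
Since the inline strLength lambda works while the one delegating to udf.extract does not, a variant worth comparing against is a fully self-contained lambda that captures nothing from the enclosing scope (a sketch, not a confirmed fix):

    // Sketch: all logic lives inside the lambda, so no outer object
    // needs to be serialized along with the closure.
    sqlContext.udf.register("extract", (dateUnit: String, date: String) => {
      val parts = date.split("-")
      dateUnit.toUpperCase() match {
        case "YEAR"  => parts(0)
        case "MONTH" => parts(1)
        case "DAY"   => parts(2)
        case other   => "Unexpected case :" + other
      }
    })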
