Spark: "not found: type RDD" error when creating an RDD

I am trying to create an RDD of case class objects. For instance:

    // sqlContext from the previous example is used in this example.
    // createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD.
    import sqlContext.createSchemaRDD

    val people: RDD[Person] = ... // An RDD of case class objects, from the previous example.

    // The RDD is implicitly converted to a SchemaRDD by createSchemaRDD,
    // allowing it to be stored using Parquet.
    people.saveAsParquetFile("people.parquet")

I am trying to complete the missing part of the previous example by specifying:

    case class Person(name: String, age: Int)

    // Create an RDD of Person objects and register it as a table.
    val people: RDD[Person] = sc.textFile("/user/root/people.txt")
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))
    people.registerAsTable("people")

I get the following error:

    <console>:28: error: not found: type RDD
           val people: RDD[Person] = sc.textFile("/user/root/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))

Any idea on what went wrong? Thanks in advance!

1 answer

The problem is the explicit RDD[Person] type annotation. RDD is not imported by default in the spark-shell, so Scala complains that it cannot find the RDD type. Try running import org.apache.spark.rdd.RDD first.
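A minimal sketch of the corrected snippet, assuming the Spark 1.0-era SchemaRDD API used in the question (registerAsTable was renamed in later releases):

    import org.apache.spark.rdd.RDD
    import sqlContext.createSchemaRDD

    case class Person(name: String, age: Int)

    // With RDD now in scope, the explicit type annotation compiles.
    val people: RDD[Person] = sc.textFile("/user/root/people.txt")
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))

    people.registerAsTable("people")

Alternatively, dropping the annotation (val people = sc.textFile(...)...) sidesteps the import entirely, since type inference does not require the RDD name to be in scope.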

