RDD serialization

I have an RDD that I am trying to serialize and then rebuild by deserializing. I am trying to figure out if this is possible in Apache Spark.

    static JavaSparkContext sc = new JavaSparkContext(conf);
    static SerializerInstance si = SparkEnv.get().closureSerializer().newInstance();
    static ClassTag<JavaRDD<String>> tag = scala.reflect.ClassTag$.MODULE$.apply(JavaRDD.class);

    JavaRDD<String> rdd = sc.textFile(logFile, 4);
    System.out.println("Element 1 " + rdd.first());
    ByteBuffer bb = si.serialize(rdd, tag);
    JavaRDD<String> rdd2 = si.deserialize(bb, Thread.currentThread().getContextClassLoader(), tag);
    System.out.println("Element 0 " + rdd2.first());

I get an exception on the last line, when I perform an action on the deserialized RDD. The way I serialize it is similar to how Spark does it internally.

Exception in thread "main" org.apache.spark.SparkException: RDD transformations and actions can only be invoked by the driver, not inside of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063.
    at org.apache.spark.rdd.RDD.sc(RDD.scala:87)
    at org.apache.spark.rdd.RDD.take(RDD.scala:1177)
    at org.apache.spark.rdd.RDD.first(RDD.scala:1189)
    at org.apache.spark.api.java.JavaRDDLike$class.first(JavaRDDLike.scala:477)
    at org.apache.spark.api.java.JavaRDD.first(JavaRDD.scala:32)
    at SimpleApp.sparkSend(SimpleApp.java:63)
    at SimpleApp.main(SimpleApp.java:91)

The RDD is created and loaded within a single process, so I don't understand how this error occurs.

I am the author of this warning message.

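Spark does not support performing actions or transformations on copies of RDDs that were created by deserialization. RDDs are serializable so that certain methods on them can be invoked on executors, but end users should not try to serialize RDDs manually.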

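When an RDD is serialized, it loses its reference to the SparkContext that created it, which prevents jobs from being launched with it. In earlier versions of Spark, your code would have thrown a NullPointerException when Spark tried to access the private, null RDD.sc field.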
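
The same failure mode can be reproduced in plain Java without Spark. The sketch below is only an analogy, assuming the context reference is excluded from serialization through a transient field; `Context`, `Dataset`, and `roundTrip` are hypothetical names invented for illustration and are not Spark classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientContextDemo {

    // Hypothetical stand-in for SparkContext: driver-side state, not serializable.
    static class Context {
        String runJob(String name) {
            return "ran " + name;
        }
    }

    // Hypothetical stand-in for an RDD: serializable, but the context
    // reference is transient, so it is skipped when the object is written.
    static class Dataset implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;
        private transient Context context;

        Dataset(String name, Context context) {
            this.name = name;
            this.context = context;
        }

        // Guarded accessor: fail with a descriptive error instead of a bare
        // NullPointerException when the reference was lost.
        Context context() {
            if (context == null) {
                throw new IllegalStateException(
                    "context is missing; this object was probably deserialized");
            }
            return context;
        }

        // An "action": it needs the context to launch work.
        String first() {
            return context().runJob(name);
        }
    }

    // Serializes the value to a byte array and reads it back as a new object.
    static <T extends Serializable> T roundTrip(T value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        Dataset original = new Dataset("lines", new Context());
        System.out.println(original.first()); // the original still has its context

        Dataset copy = roundTrip(original);   // same process, but the context is gone
        try {
            copy.first();
        } catch (IllegalStateException e) {
            System.out.println("copy failed: " + e.getMessage());
        }
    }
}
```

The copy fails even though it never leaves the process, which matches the situation in the question: being on the driver does not help once the object has gone through a serialization round trip.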

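The error message is worded this way because users frequently ran into confusing NullPointerExceptions when trying things like rdd1.map { _ => rdd2.count() }, which causes actions to be invoked on deserialized RDDs on the executors. I did not anticipate that anyone would manually serialize and deserialize an RDD on the driver, so I can see how this error message could be slightly misleading in your case.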


