How to write a Dataset object to Excel in Spark Java?

Question

I am reading an Excel file using the com.crealytics.spark.excel package. Below is the code for reading the Excel file in Spark Java.

    Dataset<Row> SourcePropertSet = sqlContext.read()
               .format("com.crealytics.spark.excel")
               .option("location", "D:\\5Kto10K.xlsx")
               .option("useHeader", "true")
               .option("treatEmptyValuesAsNulls", "true")
               .option("inferSchema", "true")
               .option("addColorColumns", "false")
               .load("com.databricks.spark.csv");

I then tried to use the same package (com.crealytics.spark.excel) to write the Dataset object to an Excel file in Spark Java.

    SourcePropertSet.write()
          .format("com.crealytics.spark.excel")
          .option("useHeader", "true")
          .option("treatEmptyValuesAsNulls", "true")
          .option("inferSchema", "true")
          .option("addColorColumns", "false").save("D:\\resultset.xlsx");

But I am getting the error below.

java.lang.RuntimeException: com.crealytics.spark.excel.DefaultSource does not allow creating a table as select.

I also tried the org.zuinnote.spark.office.excel package. Below is the code for this.

    SourcePropertSet.write()
             .format("org.zuinnote.spark.office.excel")
             .option("write.locale.bcp47", "de") 
             .save("D:\\result");

I added the following dependencies to my pom.xml

    <dependency>
        <groupId>com.github.zuinnote</groupId>
        <artifactId>hadoopoffice-fileformat</artifactId>
        <version>1.0.0</version>
    </dependency>
    <dependency>
        <groupId>com.github.zuinnote</groupId>
        <artifactId>spark-hadoopoffice-ds_2.11</artifactId>
        <version>1.0.3</version>
    </dependency>

But I am getting the error below.

java.lang.IllegalAccessError: org.zuinnote.hadoop.office.format.mapreduce.ExcelFileOutputFormat.getSuffix(Ljava/lang/String;) Ljava/lang/String; org.zuinnote.spark.office.excel.ExcelOutputWriterFactory

Please help me write the Dataset object to an Excel file in Spark Java.

+4

apache-spark pyspark apache-spark-sql spark-dataframe

BHANUMATHI H M 24 . '17 7:23

2

This looks like a version problem with HadoopOffice. Can you upgrade from 1.0.3 to 1.0.4 and try the following?

    SourcePropertSet.write()
             .format("org.zuinnote.spark.office.excel")
             .option("spark.write.useHeader", true)
             .option("write.locale.bcp47", "us")
             .save("D:\\result");
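For reference, the version bump might look like this in the pom.xml from the question. This is only a sketch: it assumes that both artifacts are available at 1.0.4 and that aligning hadoopoffice-fileformat with the data source is what removes the IllegalAccessError. Neither assumption is confirmed in this thread, so check the published versions before relying on it.

```xml
<!-- Assumption: both artifacts are published at 1.0.4; verify on Maven Central. -->
<dependency>
    <groupId>com.github.zuinnote</groupId>
    <artifactId>hadoopoffice-fileformat</artifactId>
    <version>1.0.4</version>
</dependency>
<dependency>
    <groupId>com.github.zuinnote</groupId>
    <artifactId>spark-hadoopoffice-ds_2.11</artifactId>
    <version>1.0.4</version>
</dependency>
```

Keeping the two version numbers identical avoids mixing classes compiled against different releases, which is a common cause of an IllegalAccessError at runtime.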

With version 1.0.4 of the Spark 2 data source for HadoopOffice, you can also read the file:

    Dataset<Row> SourcePropertSet = sqlContext.read()
                   .format("org.zuinnote.spark.office.excel")
                   .option("spark.read.useHeader", "true")
                   .option("spark.read.simpleMode", "true")
                   .load("D:\\5Kto10K.xlsx");

Under the hood, Excel files are handled with Apache POI.

More information: https://github.com/ZuInnoTe/spark-hadoopoffice-ds

0

Jörn Franke 31 . '17 22:56

Nikolay Vasiliev · Accepted Answer · 2017-06-25T11:15:51+0000

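It looks like the library you chose, com.crealytics.spark.excel, does not have any code for writing Excel files. Internally it uses Apache POI to read Excel files, and there are also a few examples.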

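The good news is that a CSV file can be opened in Excel, and you can use spark-csv to write it. You need to change your code like this: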

    SourcePropertSet.write()
        .format("com.databricks.spark.csv")
        .option("header", "true")
        .save("D:\\resultset.csv");

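Note that Spark writes 1 output file per partition, so you may want to call .repartition(1) before .write() to get exactly one result file.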
