How to write a Dataset object to Excel in Spark Java?

I am reading an Excel file using the com.crealytics.spark.excel package. Below is the code for reading the Excel file in Spark Java.

    Dataset<Row> SourcePropertSet = sqlContext.read()
               .format("com.crealytics.spark.excel")
               .option("location", "D:\\5Kto10K.xlsx")
               .option("useHeader", "true")
               .option("treatEmptyValuesAsNulls", "true")
               .option("inferSchema", "true")
               .option("addColorColumns", "false")
               .load("com.databricks.spark.csv");

I then tried to use the same package (com.crealytics.spark.excel) to write a Dataset object to an Excel file in Spark Java.

    SourcePropertSet.write()
          .format("com.crealytics.spark.excel")
          .option("useHeader", "true")
          .option("treatEmptyValuesAsNulls", "true")
          .option("inferSchema", "true")
          .option("addColorColumns", "false").save("D:\\resultset.xlsx");

But I am getting the error below.

    java.lang.RuntimeException: com.crealytics.spark.excel.DefaultSource does not allow create table as select.

I also tried the org.zuinnote.spark.office.excel package. Below is the code for this.

    SourcePropertSet.write()
             .format("org.zuinnote.spark.office.excel")
             .option("write.locale.bcp47", "de") 
             .save("D:\\result");

I added the following dependencies to my pom.xml:

    <dependency>
        <groupId>com.github.zuinnote</groupId>
        <artifactId>hadoopoffice-fileformat</artifactId>
        <version>1.0.0</version>
    </dependency>
    <dependency>
        <groupId>com.github.zuinnote</groupId>
        <artifactId>spark-hadoopoffice-ds_2.11</artifactId>
        <version>1.0.3</version>
    </dependency>

But I am getting the error below.

    java.lang.IllegalAccessError: tried to access method org.zuinnote.hadoop.office.format.mapreduce.ExcelFileOutputFormat.getSuffix(Ljava/lang/String;)Ljava/lang/String; from class org.zuinnote.spark.office.excel.ExcelOutputWriterFactory

Please help me write a Dataset object to an Excel file in Spark Java.


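It looks like the library you chose, com.crealytics.spark.excel, does not contain any code for writing Excel files. Under the hood it uses Apache POI to read Excel files, and there are a few examples of that as well.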

The good news is that a CSV file can be opened by Excel, and you can use spark-csv to write one. Change your code like this:

    SourcePropertSet.write()
        .format("com.databricks.spark.csv")
        .option("header", "true")
        .save("D:\\resultset.csv");

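Keep in mind that Spark writes 1 output file per partition, so you may want to call .repartition(1) before writing to get exactly one result file.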
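Because Spark's `save` produces a directory of `part-*` files rather than one named file, a small helper can pull the single part file out after a `.repartition(1)` write. The sketch below uses only `java.nio.file`. The class name `SinglePartFile` and the method `moveSinglePart` are hypothetical names chosen for illustration, and it assumes the output was written to a local file system.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SinglePartFile {

    /**
     * Moves the only part-* file found in a Spark output directory to target.
     * Assumption: the dataset was written with .repartition(1), so exactly
     * one part file is expected; anything else is reported as an error.
     */
    public static Path moveSinglePart(Path sparkOutputDir, Path target) throws IOException {
        Path part = null;
        // The glob skips _SUCCESS and the hidden .crc checksum files.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(sparkOutputDir, "part-*")) {
            for (Path candidate : stream) {
                if (part != null) {
                    throw new IllegalStateException("More than one part file in " + sparkOutputDir);
                }
                part = candidate;
            }
        }
        if (part == null) {
            throw new IllegalStateException("No part file found in " + sparkOutputDir);
        }
        return Files.move(part, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

For example, if the dataset was saved with `SourcePropertSet.repartition(1).write()` to a directory such as `D:\\resultset_dir` (a hypothetical path), calling `SinglePartFile.moveSinglePart(Paths.get("D:\\resultset_dir"), Paths.get("D:\\resultset.csv"))` would leave a single CSV file that Excel can open.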


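The error you get when writing comes from an old version of the HadoopOffice library. Make sure you have only version 1.0.3 or, better, 1.0.4 as a dependency. Can you share your build file? The following should work:

    SourcePropertSet.write()
             .format("org.zuinnote.spark.office.excel")
             .option("spark.write.useHeader", true)
             .option("write.locale.bcp47", "us")
             .save("D:\\result");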

Version 1.0.4 of the Spark2 data source for HadoopOffice also supports inferring the schema when reading:

    Dataset<Row> SourcePropertSet = sqlContext.read()
               .format("org.zuinnote.spark.office.excel")
               .option("spark.read.useHeader", "true")
               .option("spark.read.simpleMode", "true")
               .load("D:\\5Kto10K.xlsx");

Note that it is not recommended to mix different Excel data sources based on POI in one application.

More information: https://github.com/ZuInnoTe/spark-hadoopoffice-ds


Source: https://habr.com/ru/post/1680012/

