Overloaded method error using spark-csv

Question

Overloaded method error using spark-csv

I work with the spark-csv Databricks package (through the Scala API) and have problems defining a custom schema.

After starting the console with

spark-shell  --packages com.databricks:spark-csv_2.11:1.2.0

I import the necessary types

import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}

and then try defining this circuit:

val customSchema = StructType(
    StructField("user_id", IntegerType, true),
    StructField("item_id", IntegerType, true),
    StructField("artist_id", IntegerType, true),
    StructField("scrobble_time", StringType, true))

but I get the following error:

<console>:26: error: overloaded method value apply with alternatives:
  (fields: Array[org.apache.spark.sql.types.StructField])org.apache.spark.sql.types.StructType <and>
  (fields: java.util.List[org.apache.spark.sql.types.StructField])org.apache.spark.sql.types.StructType <and>
  (fields: Seq[org.apache.spark.sql.types.StructField])org.apache.spark.sql.types.StructType
 cannot be applied to (org.apache.spark.sql.types.StructField, org.apache.spark.sql.types.StructField, org.apache.spark.sql.types.StructField, org.apache.spark.sql.types.StructField)
       val customSchema = StructType(

I am very new to scala, therefore, not understanding this, but what am I doing wrong here? I follow a very simple example here .

+4

scala apache-spark apache-spark-sql

mustachio Nov 10 '15 at 17:08

source share

1 answer

Rohan Aletty · Accepted Answer · 2015-11-10T17:24:35+0000

You need to pass your set StructFieldas Seq.

Something like any of the following:

val customSchema = StructType(Seq(StructField("user_id", IntegerType, true), StructField("item_id", IntegerType, true), StructField("artist_id", IntegerType, true), StructField("scrobble_time", StringType, true)))

val customSchema = (new StructType)
  .add("user_id", IntegerType, true)
  .add("item_id", IntegerType, true)
  .add("artist_id", IntegerType, true)
  .add("scrobble_time", StringType, true)

val customSchema = StructType(StructField("user_id", IntegerType, true) :: StructField("item_id", IntegerType, true) :: StructField("artist_id", IntegerType, true) :: StructField("scrobble_time", StringType, true) :: Nil)

, README, StructType, .

Overloaded method error using spark-csv

More articles: