JDBC call to Impala / Hive from a Spark job, and table creation

I am trying to write a Spark job in Scala that opens a JDBC connection to Impala so I can create a table and run other operations.

How should I do this? Any example would be very helpful. Thanks!

1 answer
import java.sql.{DriverManager, ResultSet}

val JDBCDriver = "com.cloudera.impala.jdbc41.Driver"
val ConnectionURL = "jdbc:impala://url.server.net:21050/default;auth=noSasl"

// Class.forName is enough to register the driver; the deprecated
// .newInstance call is not needed
Class.forName(JDBCDriver)
val con = DriverManager.getConnection(ConnectionURL)
val stmt = con.createStatement()
val rs = stmt.executeQuery(query)

// Drain the ResultSet on the driver, converting each row to a spark.sql.Row
val resultSetList = Iterator.continually((rs.next(), rs)).takeWhile(_._1).map { r =>
    getRowFromResultSet(r._2) // (ResultSet) => (spark.sql.Row), user-supplied
}.toList

// Release JDBC resources before handing the data to Spark
rs.close(); stmt.close(); con.close()

sc.parallelize(resultSetList)
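Since the question also asks about creating a table: DDL goes through `Statement.execute` rather than `executeQuery`, because a CREATE TABLE returns no ResultSet. A minimal sketch on the same connection; the table name, columns, and the `buildCreateTable` helper are illustrative assumptions, not part of the Impala driver API:

```scala
object CreateTableSketch {
  // Build a CREATE TABLE statement from a table name and (column, type) pairs.
  // IF NOT EXISTS and STORED AS PARQUET are common Impala choices, adjust as needed.
  def buildCreateTable(table: String, cols: Seq[(String, String)]): String = {
    val colDefs = cols.map { case (name, typ) => s"$name $typ" }.mkString(", ")
    s"CREATE TABLE IF NOT EXISTS $table ($colDefs) STORED AS PARQUET"
  }

  def main(args: Array[String]): Unit = {
    val ddl = buildCreateTable("my_db.events", Seq("id" -> "BIGINT", "payload" -> "STRING"))
    println(ddl)
    // Against a live connection from the snippet above you would then run:
    //   stmt.execute(ddl)   // execute, not executeQuery: DDL has no result set
  }
}
```

The same `stmt.execute(...)` call works for other non-query statements (DROP, ALTER, INSERT), so one connection can drive the whole table lifecycle before you read data back into an RDD.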

Source: https://habr.com/ru/post/1654070/

