SparkSession returns nothing with HiveServer2 connection via JDBC

I have a problem reading data from a remote HiveServer2 using JDBC and SparkSession in Apache Zeppelin.

Here is the code.

%spark

import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession

val prop = new java.util.Properties
prop.setProperty("user","hive")
prop.setProperty("password","hive")
prop.setProperty("driver", "org.apache.hive.jdbc.HiveDriver")

val test = spark.read.jdbc("jdbc:hive2://xxx.xxx.xxx.xxx:10000/", "tests.hello_world", prop)

test.select("*").show()

When I run this, I get no errors, but also no data — I only retrieve the column names of the table, for example:

+--------------+
|hello_world.hw|
+--------------+
+--------------+

Instead of this:

+--------------+
|hello_world.hw|
+--------------+
| data_here    |
+--------------+

My setup: Scala 2.11.8, OpenJDK 8, Zeppelin 0.7.0, Spark 2.1.0 (bde/spark), Hive 2.1.1 (bde/hive).

Everything runs in Docker, with each service in its own container, all connected to the same network.

Also, it works fine when I use beeline from the Spark container to connect to the remote Hive.

Did I forget something? Any help would be greatly appreciated. Thanks in advance.

EDIT:

I found a solution: instead of going from Spark to Hive over JDBC, I linked the Spark and Hive Docker containers and configured Hive directly in spark-defaults.conf, so SparkSession no longer needs JDBC at all. The steps are:

  • Link the Spark container to the Hive container
  • Add the following to spark-defaults.conf:

    spark.serializer                 org.apache.spark.serializer.KryoSerializer
    spark.driver.memory              Xg
    spark.driver.cores               X
    spark.executor.memory            Xg
    spark.executor.cores             X
    spark.sql.warehouse.dir          file:///your/path/here

Replace "X" with your own values.

Hope this helps.
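With Hive configured in spark-defaults.conf this way, the table can be read through Spark's native Hive support instead of JDBC. A minimal sketch (the table name comes from the question above; the app name is arbitrary, and a working metastore connection is assumed):

```scala
import org.apache.spark.sql.SparkSession

// Build a session with Hive support enabled; it picks up the settings
// from spark-defaults.conf (and hive-site.xml, if present) automatically.
val spark = SparkSession.builder()
  .appName("HiveRead") // arbitrary name, just for this example
  .enableHiveSupport()
  .getOrCreate()

// Query the table through the Hive metastore — no JDBC driver involved
val df = spark.sql("SELECT * FROM tests.hello_world")
df.show()
```

Note that `enableHiveSupport()` must be called before `getOrCreate()`; in Zeppelin the provided `spark` session is already Hive-enabled when Spark is built with Hive support.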
