I have a problem reading data from a remote HiveServer2 using JDBC and SparkSession in Apache Zeppelin.
Here is the code.
%spark
import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession
val prop = new java.util.Properties
prop.setProperty("user","hive")
prop.setProperty("password","hive")
prop.setProperty("driver", "org.apache.hive.jdbc.HiveDriver")
val test = spark.read.jdbc("jdbc:hive2://xxx.xxx.xxx.xxx:10000/", "tests.hello_world", prop)
test.select("*").show()
When I run this, I get no errors, but also no data; all I retrieve is the table's column name, for example:
+--------------+
|hello_world.hw|
+--------------+
+--------------+
Instead of this:
+--------------+
|hello_world.hw|
+--------------+
+ data_here +
+--------------+
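One possible culprit I have not confirmed for this setup: Spark's generic JDBC dialect quotes column names with double quotes (e.g. SELECT "hw" FROM ...), and HiveServer2 parses double-quoted identifiers as string literals, so the query returns the column name instead of the column's values. A minimal sketch of a workaround would be to register a Hive-aware dialect (the object name `HiveDialect` is my own) before calling spark.read.jdbc:

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Hypothetical workaround: quote identifiers with backticks, which
// HiveServer2 resolves as column names, instead of Spark's default
// double quotes, which Hive treats as string literals.
object HiveDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:hive2")

  override def quoteIdentifier(colName: String): String =
    s"`$colName`"
}

// Must run before spark.read.jdbc(...) so the dialect is picked up.
JdbcDialects.registerDialect(HiveDialect)
```

I have not verified whether this alone fixes the empty result in my environment.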
My setup: Scala 2.11.8, OpenJDK 8, Zeppelin 0.7.0, Spark 2.1.0 (bde/spark), Hive 2.1.1 (bde/hive).
Everything runs in Docker; each service has its own container, but they are all connected to the same network.
Also, it works fine when I use beeline from the Spark container to connect to the remote Hive.
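For reference, the beeline connection that does return data looks roughly like this (host and credentials are the placeholders from above):

```shell
# Connect to the remote HiveServer2 and run the same query that
# returns nothing through spark.read.jdbc.
beeline -u "jdbc:hive2://xxx.xxx.xxx.xxx:10000/" -n hive -p hive \
  -e "SELECT * FROM tests.hello_world;"
```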
Did I forget something? Any help would be greatly appreciated. Thanks in advance.
EDIT:
I also tried running Spark and Hive together in Docker, adding the Hive configuration to spark-defaults.conf and querying through SparkSession directly instead of JDBC. That attempt failed as well, with an error about "X".
So it still does not work.