How do I connect the SparkSQL ThriftServer (HiveServer) to Cassandra?

So, I am working with Tableau, Spark 1.2 and Cassandra 2.1.2, and I have managed to get a few pieces working.

What I am missing is how to properly configure the Spark 1.2 ThriftServer so that it can talk to my Cassandra instance. The ultimate goal is to run SparkSQL through Tableau, which requires the ThriftServer. I can start the ThriftServer without problems (basically), connect to it with beeline as in the examples, and run "show tables". But, as you can see below, the result comes back empty.

beeline> !connect jdbc:hive2://192.168.56.115:10000
scan complete in 2ms
Connecting to jdbc:hive2://192.168.56.115:10000
Enter username for jdbc:hive2://192.168.56.115:10000:
Enter password for jdbc:hive2://192.168.56.115:10000:
log4j:WARN No appenders could be found for logger (org.apache.thrift.transport.TSaslTransport).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Connected to: Spark SQL (version 1.2.0)
Driver: null (version null)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.56.115:10000> show tables;
+---------+
| result |
+---------+
+---------+
No rows selected (1.755 seconds)
0: jdbc:hive2://192.168.56.115:10000>

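
To repeat the check above without an interactive session, here is a minimal Python sketch. It assumes beeline is on the PATH and relies only on its -u (JDBC URL) and -e (query) options. The helper names build_beeline_command, count_result_rows and show_tables are hypothetical, made up for this illustration, and the row counting assumes beeline's default table output format as shown in the transcript.

```python
import subprocess


def build_beeline_command(host, port, query, beeline="beeline"):
    """Build a non-interactive beeline call: -u is the JDBC URL, -e the query."""
    url = "jdbc:hive2://{}:{}".format(host, port)
    return [beeline, "-u", url, "-e", query]


def count_result_rows(table_text):
    """Count data rows in beeline's default table output.

    Assumption: every row line starts with '|' and the first such
    line is the column header, so it is not counted.
    """
    rows = [line for line in table_text.splitlines()
            if line.strip().startswith("|")]
    return max(len(rows) - 1, 0)


def show_tables(host, port):
    """Run "show tables;" against the ThriftServer and return the table count.

    Needs a running ThriftServer and beeline on the PATH.
    """
    cmd = build_beeline_command(host, port, "show tables;")
    completed = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return count_result_rows(completed.stdout)
```

With my current setup, show_tables("192.168.56.115", 10000) should return 0, matching the empty result above.
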
  • Do I need the DataStax connector? I assume the answer is yes.
  • Does hive-site.xml need to be declared, even though I am not using Hive at all?
  • Can I run this setup without Hive / Metastore? Or is that a ThriftServer requirement in Spark 1.2?
  • I am assuming my existing Spark Master / Worker settings are correct, but they may not be.

Help!:)


Source: https://habr.com/ru/post/981287/

