My question relates to connecting from Java to Hive, but my case is a bit different.
My Hive instance is running on machine1, and I need to submit queries to it from a Java server running on machine2. As I understand it, Hive exposes a JDBC interface for receiving remote queries. I took the code from here - HiveServer2 Clients
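For context, this is a minimal sketch of the kind of JDBC client that article describes. The host (machine1), port 10000 (the HiveServer2 default), database name, and credentials are assumptions to adapt to your setup:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Minimal HiveServer2 JDBC client sketch. Only java.sql types are needed at
// compile time; the Hive driver is loaded reflectively at runtime, which is
// where the Hadoop classes (org.apache.hadoop.conf.Configuration) get pulled in.
public class HiveJdbcClientSketch {

    // Builds a HiveServer2 JDBC URL, e.g. jdbc:hive2://machine1:10000/default
    static String buildUrl(String host, int port, String db) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) throws Exception {
        // Registering the driver is the step that fails with
        // NoClassDefFoundError when the Hadoop jars are missing.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        try (Connection con = DriverManager.getConnection(
                buildUrl("machine1", 10000, "default"), "hiveuser", "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```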
I added the dependencies listed in the article:
- hive-jdbc*.jar
- hive-service*.jar
- libfb303-0.9.0.jar
- libthrift-0.9.0.jar
- log4j-1.2.16.jar
- slf4j-api-1.6.1.jar
- slf4j-log4j12-1.6.1.jar
- commons-logging-1.0.4.jar
However, when I run it I get a java.lang.NoClassDefFoundError. Full error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
    at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:393)
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:215)
    at com.bidstalk.tools.RawLogsQuerySystem.HiveJdbcClient.main(HiveJdbcClient.java:25)
Another StackOverflow question recommends adding the Hadoop API dependencies in Maven - Hive Error
I don't understand why I need the Hadoop API to connect a client to Hive. Shouldn't the JDBC driver be agnostic of the underlying query engine? I just need to submit some SQL queries.
Edit: I am using Cloudera (5.3.1), so I think I need to add the CDH dependencies. The Cloudera instance runs Hadoop 2.5.0 and HiveServer2.
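If the fix is indeed the Hadoop client jars, the Maven entry might look something like the following; the exact artifact and version string are assumptions (CDH typically ships versioned artifacts like `2.5.0-cdh5.3.1` from Cloudera's own repository, so check your distribution):

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.5.0</version>
</dependency>
```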
But the servers are on machine1. On machine2, the code should at least compile, and I should only hit problems at runtime!