Connecting to Hive through Java JDBC

This question is related to connecting from Java to Hive, but my setup is different.

My Hive instance is running on machine1, and I need to run some queries from a Java server running on machine2. As I understand it, Hive exposes a JDBC interface for receiving remote queries. I took the code from here - HiveServer2 Clients

I added the dependencies listed in the article:

  • hive-jdbc*.jar
  • hive-service*.jar
  • libfb303-0.9.0.jar
  • libthrift-0.9.0.jar
  • log4j-1.2.16.jar
  • slf4j-api-1.6.1.jar
  • slf4j-log4j12-1.6.1.jar
  • commons-logging-1.0.4.jar

However, during compilation I got a java.lang.NoClassDefFoundError. Full error:

 Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
     at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:393)
     at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
     at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
     at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
     at java.sql.DriverManager.getConnection(DriverManager.java:571)
     at java.sql.DriverManager.getConnection(DriverManager.java:215)
     at com.bidstalk.tools.RawLogsQuerySystem.HiveJdbcClient.main(HiveJdbcClient.java:25)

Another StackOverflow question recommends adding the Hadoop API dependencies in Maven - Hive Error

I don't understand why I need the Hadoop API to connect a client to Hive. Shouldn't the JDBC driver be agnostic of the underlying query system? I just want to pass it some SQL queries.
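For context, all a JDBC client does at the API level is build a connection URL and hand it to DriverManager - the rest is the driver's business. A minimal sketch of the HiveServer2 URL format (the class and parameter names here are illustrative, not from the original post; 10000 is HiveServer2's default port):

```java
public class HiveUrl {
    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/db.
    static String jdbcUrl(String host, int port, String db) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) {
        // Prints: jdbc:hive2://machine1:10000/default
        System.out.println(jdbcUrl("machine1", 10000, "default"));
    }
}
```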

Edit: I am using Cloudera (5.3.1), and I think I need to add the CDH dependencies. The Cloudera instance runs Hadoop 2.5.0 and HiveServer2.

But the servers are on machine1. The code should at least compile on my machine, and I should only hit problems at runtime!

5 answers

Answering my own question!

With some hit and trial, I added the following dependencies to my pom file, and since then I can run the code against both CDH 5.3.1 and CDH 5.2.1 clusters.

 <dependency>
   <groupId>org.apache.hive</groupId>
   <artifactId>hive-jdbc</artifactId>
   <version>0.13.1-cdh5.3.1</version>
 </dependency>
 <dependency>
   <groupId>org.apache.thrift</groupId>
   <artifactId>libthrift</artifactId>
   <version>0.9.0</version>
 </dependency>
 <dependency>
   <groupId>org.apache.thrift</groupId>
   <artifactId>libfb303</artifactId>
   <version>0.9.0</version>
 </dependency>
 <dependency>
   <groupId>org.apache.hadoop</groupId>
   <artifactId>hadoop-core</artifactId>
   <version>2.5.0-mr1-cdh5.3.1</version>
 </dependency>
 <dependency>
   <groupId>org.apache.hadoop</groupId>
   <artifactId>hadoop-common</artifactId>
   <version>2.5.0-cdh5.3.1</version>
 </dependency>
 <dependency>
   <groupId>org.apache.hive</groupId>
   <artifactId>hive-exec</artifactId>
   <version>0.13.1-cdh5.3.1</version>
 </dependency>
 <dependency>
   <groupId>org.apache.hadoop</groupId>
   <artifactId>hadoop-hdfs</artifactId>
   <version>2.5.0-cdh5.3.1</version>
 </dependency>

Please note that some of these dependencies may not be required.


In case you haven't solved this yet, I have given it a go. I needed the following dependencies to compile and run it:

  • libthrift-0.9.0-cdh5-2.jar
  • httpclient-4.2.5.jar
  • httpcore-4.2.5.jar
  • commons-logging-1.1.3.jar
  • hive-common.jar
  • slf4j-api-1.7.5.jar
  • hive-metastore.jar
  • hive-service.jar
  • hadoop-common.jar
  • hive-jdbc.jar
  • guava-11.0.2.jar

The Hive documentation was probably written against an older version/distribution.

Your exception is due to the missing hadoop-common jar, which contains org.apache.hadoop.conf.Configuration .
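A quick way to check whether a given class is actually on your runtime classpath before debugging jar lists - a minimal sketch (the helper class and method names are made up for illustration):

```java
public class ClasspathCheck {
    // Returns true if the named class can be loaded from the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // This is the class the stack trace complains about; it ships in hadoop-common.
        String cls = "org.apache.hadoop.conf.Configuration";
        System.out.println(cls + (isOnClasspath(cls) ? " found" : " missing"));
    }
}
```

If this prints "missing", the NoClassDefFoundError above is expected, and adding hadoop-common to the classpath should resolve it.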

Hope this helps.


I got the same error when trying to use hive-jdbc 1.2.1 against Hive 0.13. Compared to the long lists in the other answers, we now use just these two:

  • hive-jdbc-1.2.1-standalone.jar
  • hadoop-common-2.7.1.jar

One more note: you can get " Required field 'client_protocol' is unset! " when using the latest JDBC driver against an older Hive. If so, change the JDBC version to 1.1.0:

 <dependency>
   <groupId>org.apache.hive</groupId>
   <artifactId>hive-jdbc</artifactId>
   <version>1.1.0</version>
   <classifier>standalone</classifier>
 </dependency>
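A hedged sketch of how client code might recognize this particular version-mismatch error and distinguish it from ordinary connection failures (the class and method names are illustrative; the check keys off the "client_protocol" text quoted above):

```java
import java.sql.SQLException;

public class ProtocolMismatch {
    // Returns true when the SQLException text looks like the
    // HiveServer2/driver version mismatch described above.
    static boolean isProtocolMismatch(SQLException e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("client_protocol");
    }

    public static void main(String[] args) {
        SQLException sample =
                new SQLException("Required field 'client_protocol' is unset!");
        // Prints: true
        System.out.println(isProtocolMismatch(sample));
    }
}
```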

For others interested in knowing exactly what is required to run a Hive query remotely using Java...

Java code

 import java.sql.Connection;
 import java.sql.DriverManager;
 import java.sql.ResultSet;
 import java.sql.SQLException;
 import java.sql.Statement;

 public class Runner {
     private static String driverName = "org.apache.hive.jdbc.HiveDriver";

     public static void main(String[] args) throws SQLException {
         try {
             // Register driver and create driver instance
             Class.forName(driverName);
         } catch (ClassNotFoundException ex) {
             ex.printStackTrace();
         }

         // get connection (10000 is the default HiveServer2 port)
         System.out.println("before trying to connect");
         Connection con = DriverManager.getConnection(
                 "jdbc:hive2://[HOST IP]:10000/", "hive", "");
         System.out.println("connected");

         // create statement and execute query, printing each table name
         Statement stmt = con.createStatement();
         ResultSet rs = stmt.executeQuery("show tables");
         while (rs.next()) {
             System.out.println(rs.getString(1));
         }
         con.close();
     }
 }

Along with a pom file containing only the necessary dependencies:

 <?xml version="1.0" encoding="UTF-8"?>
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <groupId>test-executor</groupId>
   <artifactId>test-executor</artifactId>
   <version>1.0-SNAPSHOT</version>
   <properties>
     <hadoop.version>2.5.2</hadoop.version>
   </properties>
   <dependencies>
     <dependency>
       <groupId>org.apache.hive</groupId>
       <artifactId>hive-exec</artifactId>
       <version>1.2.1</version>
     </dependency>
     <dependency>
       <groupId>org.apache.hive</groupId>
       <artifactId>hive-jdbc</artifactId>
       <version>1.2.1</version>
     </dependency>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-hdfs</artifactId>
       <version>${hadoop.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-common</artifactId>
       <version>${hadoop.version}</version>
     </dependency>
   </dependencies>
 </project>

I ran into the same issue with CDH 5.4.1. I updated my POM file as below and it worked for me.

My Hadoop version is Hadoop 2.6.0-cdh5.4.1 and my Hive version is Hive 1.1.0-cdh5.4.1 .

 <dependency>
   <groupId>org.apache.hive</groupId>
   <artifactId>hive-exec</artifactId>
   <version>0.13.0</version>
 </dependency>
 <dependency>
   <groupId>org.apache.hive</groupId>
   <artifactId>hive-jdbc</artifactId>
   <version>0.13.0</version>
 </dependency>
 <dependency>
   <groupId>org.apache.thrift</groupId>
   <artifactId>libthrift</artifactId>
   <version>0.9.0</version>
 </dependency>
 <dependency>
   <groupId>org.apache.thrift</groupId>
   <artifactId>libfb303</artifactId>
   <version>0.9.0</version>
 </dependency>
 <dependency>
   <groupId>commons-logging</groupId>
   <artifactId>commons-logging</artifactId>
   <version>1.1.3</version>
 </dependency>
 <dependency>
   <groupId>org.apache.hadoop</groupId>
   <artifactId>hadoop-client</artifactId>
   <version>2.6.0</version>
 </dependency>

This POM update solved it for me.


Source: https://habr.com/ru/post/983090/
