As far as I understand, there are two ways to connect to Hive:
- using the Hive metastore server, which connects in the background to a relational database, such as MySQL, to store and serve the schema. This usually runs on port 9083.
- using the Hive JDBC server, called HiveServer2, which usually runs on port 10000 (or 10001 in HTTP transport mode) ...
Now, in earlier versions of Hive, HiveServer2 was not very stable, and its support for multithreading was also limited. I would guess that things have improved in this area.
So, as for the JDBC API: yes, it will let you communicate with Hive using JDBC and SQL.
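A minimal sketch of what the JDBC route might look like, using only the standard `java.sql` API. The class name `HiveJdbcSketch`, the helper names, and the host, port, database and user values are all placeholders for illustration. It assumes the Hive JDBC driver jar is on the classpath and that the `employee` table from the example query exists.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Hive.
class HiveJdbcSketch {

    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs a query and returns the first column of every row as a string.
    static List<String> firstColumn(String url, String user, String password, String sql)
            throws SQLException {
        List<String> values = new ArrayList<>();
        // try-with-resources closes the result set, statement and connection.
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        }
        return values;
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder connection details: adjust to your HiveServer2 instance.
        String url = jdbcUrl("localhost", 10000, "default");
        for (String value : firstColumn(url, "hive", "", "select * from employee")) {
            System.out.println(value);
        }
    }
}
```

Running `main` needs a reachable HiveServer2; without one, `DriverManager.getConnection` throws an `SQLException`.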
As for connecting through the metastore, there are apparently two kinds of functionality:
- executing SQL queries (DML)
- performing DDL operations
DDL -
For DDL, the metastore API is useful: org.apache.hadoop.hive.metastore.HiveMetaStoreClient can be used for this purpose.
DML -
What I found useful here is the org.apache.hadoop.hive.ql.Driver class (https://hive.apache.org/javadocs/r0.13.1/api/ql/org/apache/hadoop/hive/ql/Driver.html). This class has a method called run() that lets you execute an SQL statement and get back a result. E.g., you can do the following:
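    import org.apache.hadoop.hive.cli.CliSessionState;
    import org.apache.hadoop.hive.conf.HiveConf;
    import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
    import org.apache.hadoop.hive.ql.Driver;
    import org.apache.hadoop.hive.ql.session.SessionState;

    // hiveConf is an existing HiveConf; db and table are String names.
    Driver driver = new Driver(hiveConf);
    HiveMetaStoreClient client = new HiveMetaStoreClient(hiveConf);
    SessionState.start(new CliSessionState(hiveConf));

    // DML example
    driver.run("select * from employee");

    // DDL example
    client.dropTable(db, table);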