Hive jdbc performance improvement

Question

Does anyone know how to improve the performance of JDBC connections to Hive?

Detailed problem:

When I run a query from the Hive CLI, I get a response within 7 seconds, but over a JDBC connection to Hive the response takes 14 seconds. Are there any configuration changes that would improve query performance over a JDBC connection?

Thanks in advance.

+4

performance jdbc hadoop hive hortonworks-data-platform

techprat Jun 19 '17 at 11:44

4 answers

techprat · Answer 1 · 2017-07-04T12:46:06+0000

JVBC . , , , , .

, , - , , .

kumsgs · Answer 2 · 2017-06-28T14:55:16+0000

.

Set hive.auto.convert.join to true.
Tune the Java heap size and garbage collection settings.
Switch to Tez with set hive.execution.engine = tez; to check the current engine, look at hive.execution.engine.

Hive

, .

Srini Sydney · Answer 3 · 2017-06-28T21:58:36+0000

jdbc jdbc - , ( jdbc 3.0). hive cli .

-- enable cost based optimizer
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;

--collects statistics
analyze table <TABLENAME> compute statistics for columns;

--enable vectorization of queries.
set hive.vectorized.execution.enabled = true;
set hive.vectorized.execution.reduce.enabled = true;
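
Since the question is about JDBC, one way to apply session settings like the ones above is to pass them in the connection URL rather than issuing each set statement after connecting. The following is only a sketch: the HiveJdbcUrl class, the build helper, and the localhost:10000/default endpoint are hypothetical names chosen for illustration. It relies on the HiveServer2 URL form jdbc:hive2://host:port/db?key=value;key=value, and depending on server configuration some properties may not be settable by clients.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class HiveJdbcUrl {

    // Hypothetical helper: appends Hive configuration properties to a
    // HiveServer2 JDBC URL as a semicolon-separated list after '?'.
    public static String build(String host, int port, String database,
                               Map<String, String> hiveConf) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port)
                .append('/').append(database);
        if (!hiveConf.isEmpty()) {
            StringJoiner conf = new StringJoiner(";", "?", "");
            hiveConf.forEach((key, value) -> conf.add(key + "=" + value));
            url.append(conf.toString());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Same properties as the set statements above; host, port and
        // database are placeholders, not values from the question.
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("hive.cbo.enable", "true");
        conf.put("hive.compute.query.using.stats", "true");
        conf.put("hive.vectorized.execution.enabled", "true");
        System.out.println(build("localhost", 10000, "default", conf));
    }
}
```

The resulting string can be handed to java.sql.DriverManager.getConnection once the Hive JDBC driver is on the classpath. The analyze table statement is not a setting and still has to be run as a normal statement.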

,

Jean de lavarene · Answer 4 · 2017-06-30T12:55:32+0000

If your database is Oracle, you can try Oracle Table Access for Hadoop and Spark (OTA4H), which can also be used from Hive QL. OTA4H optimizes the JDBC queries that retrieve data from Oracle, using splitters to get maximum performance. You can join Hive tables with external tables inside Oracle directly in your Hive queries.
