Check Hive HQL syntax?

Is there a programmatic way to validate HiveQL statements for errors, such as basic syntax errors? I would like to check the statements before sending them to Elastic MapReduce to save debugging time.

+4
3 answers

Yes, there is!

This is pretty easy.

Steps:

1. Get a Thrift client in your language.

I work in Ruby, so I use this wrapper: https://github.com/forward/rbhive (gem install rbhive)

If you are not using Ruby, you can download the Hive source and run thrift on the included Thrift configuration files to generate client code in most languages.

2. Connect to Hive on port 10001 and run a describe query.

In Ruby, it looks like this:

    RBHive.connect(host, port) do |connection|
      connection.fetch("describe select * from categories limit 10")
    end

If the query is invalid, the client will throw an exception with details about why the syntax is invalid. If the syntax is valid, describe returns a query tree, which you can ignore in this case.

Hope this helps.

+6

"describe select * from categories limit 10" does not work for me.

Perhaps this is due to the version of Hive being used. I am using Hive 0.8.1.4.

After doing some research, I found a solution similar to the one Matthew Rathbone provided:

Hive provides an EXPLAIN command that displays a query execution plan. The syntax for this statement is as follows:

EXPLAIN [EXTENDED] query

So, for everyone who also uses rbhive:

    RBHive.connect(host, port) do |c|
      c.execute("explain select * from categories limit 10")
    end

Note that you need to replace c.fetch with c.execute. Explain returns no results when it succeeds, so with c.fetch rbhive would throw an exception regardless of whether your syntax is correct.

execute will throw an exception if you have a syntax error or if the requested table or column does not exist. If everything is in order, no exception is thrown, but you also will not get any results, which is not a bad thing.
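
The check described above can be wrapped in a small helper. This is only a minimal sketch, not part of rbhive: validate_hql and FakeConnection are hypothetical names, and the helper assumes only that the connection object responds to execute and raises an error descending from StandardError when the statement is invalid. With rbhive you would pass in the c yielded by RBHive.connect.

```ruby
# Hypothetical helper: checks one HiveQL statement by running EXPLAIN on it.
# `connection` is any object that responds to #execute and raises when the
# statement is invalid (the behaviour described above for rbhive).
def validate_hql(connection, statement)
  connection.execute("explain #{statement}")
  { valid: true, error: nil }
rescue StandardError => e
  # Keep the message so the caller can report why the statement was rejected.
  { valid: false, error: e.message }
end

# Stand-in connection, used only to try the helper without a Hive server.
# It simulates a parse error for the misspelled keyword "selec".
class FakeConnection
  def execute(query)
    raise ArgumentError, "simulated parse error" if query =~ /\bselec\b/
    nil
  end
end

# With a real server (assumes rbhive is installed and host/port are set):
#   RBHive.connect(host, port) do |c|
#     p validate_hql(c, "select * from categories limit 10")
#   end
```

Returning a hash instead of letting the exception escape makes it easier to run the check over many statements and collect all failures in one pass.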

+4

The latest version, Hive 2.0, comes with the hplsql tool, which lets us validate Hive commands without actually running them.

Configuration: add the XML below to the hive/conf folder and restart Hive.

https://github.com/apache/hive/blob/master/hplsql/src/main/resources/hplsql-site.xml

To start hplsql and validate a query, use one of the following commands. To test a single query:

hplsql -offline -trace -e 'select * from sample'

Or, to check an entire file:

hplsql -offline -trace -f samplehql.sql

If the query syntax is correct, the response from hplsql will be something like this:

    Ln:1 SELECT // type
    Ln:1 select * from sample // command
    Ln:1 Not executed - offline mode set // execution status

If the query syntax is incorrect, hplsql reports the syntax problem in the query.

If the Hive version is older, we need to manually place the hplsql jars inside hive/lib and then proceed as above.
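
For those who want to drive this check from a script, here is a minimal Ruby sketch that builds the same hplsql command lines shown above. The helper names hplsql_check_command and run_hplsql_check are hypothetical, and the sketch assumes hplsql is on the PATH. It returns the raw output and does not try to detect errors itself, since the exact format of the error report is not shown here.

```ruby
require "open3"

# Hypothetical helper: builds the hplsql argument list for an offline check.
# Exactly one of `query:` or `file:` must be given.
def hplsql_check_command(query: nil, file: nil)
  raise ArgumentError, "give exactly one of query: or file:" unless query.nil? ^ file.nil?

  base = ["hplsql", "-offline", "-trace"]
  query ? base + ["-e", query] : base + ["-f", file]
end

# Runs the check and returns hplsql's combined stdout and stderr as a string.
# Arguments are passed as a list, so no shell quoting is needed.
def run_hplsql_check(**opts)
  output, _status = Open3.capture2e(*hplsql_check_command(**opts))
  output
end

# Example (needs hplsql installed):
#   puts run_hplsql_check(query: "select * from sample")
#   puts run_hplsql_check(file: "samplehql.sql")
```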

0

Source: https://habr.com/ru/post/1381531/

