Create a Hive table from a tab-delimited file in S3 using interactive mode

I loaded partitioned files into S3, using this folder layout under the bucket: bucket → se → y=2013 → m=07 → d=14 → h=00

Each subfolder holds one file, which contains one hour of my traffic.

Then I created an EMR job flow so I could work with Hive interactively.

When I log in to the master node and start the Hive shell, I run this command:

CREATE EXTERNAL TABLE se ( id bigint, oc_date timestamp) partitioned by (y string, m string, d string, h string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 's3://bi_data'; 

I get this error message:

FAILED: Error in metadata: java.lang.IllegalArgumentException: The bucket name parameter must be specified when listing objects in a bucket

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Can anyone help?

UPDATE: Even if I use only string fields, I get the same error. The CREATE TABLE statement with strings:

 CREATE EXTERNAL TABLE se ( id string, oc_date string) partitioned by (y string, m string, d string, h string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 's3://bi_data'; 

Hive version: 0.8.1.8

1 answer

It turned out I had made two mistakes:

  • When the location is only a bucket name, the S3 path must end with a trailing slash.

  • The underscore is also a problem; the bucket name must be DNS-compliant.
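
As a rough sketch of both fixes applied together: the statement below assumes the data has been moved to a bucket with a DNS-compliant name. The name bi-data is hypothetical and only stands in for whatever the renamed bucket is called; everything else is taken from the original statement.

```sql
-- 'bi-data' is a hypothetical, DNS-compliant replacement for 'bi_data' (no underscore).
-- The location ends with a trailing slash because it is only a bucket name.
CREATE EXTERNAL TABLE se (id bigint, oc_date timestamp)
PARTITIONED BY (y string, m string, d string, h string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://bi-data/';
```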

I hope this helps someone.

Source: https://habr.com/ru/post/949409/