Create a Hive table from a tab-delimited file in S3 using interactive mode

Question

I uploaded partitioned files to S3 with this folder layout under the bucket: bucket → se → y=2013 → m=07 → d=14 → h=00

Each subfolder contains one file, which holds one hour of my traffic.

Then I created an EMR workflow to work with Hive in interactive mode.

When I log in to the master node and start Hive, I run this command:

CREATE EXTERNAL TABLE se ( id bigint, oc_date timestamp) partitioned by (y string, m string, d string, h string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 's3://bi_data';

I get this error message:

FAILED: metadata error: java.lang.IllegalArgumentException: bucket name parameter must be specified when listing objects in the bucket
FAILED: Runtime error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Can anyone help?

UPDATE: Even if I use only string fields, I get the same error. The CREATE TABLE statement with string columns:

 CREATE EXTERNAL TABLE se ( id string, oc_date string) partitioned by (y string, m string, d string, h string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 's3://bi_data';

Hive version: 0.8.1.8

amazon-s3 amazon-web-services elastic-map-reduce hive

Gluz Jul 14 '13 at 13:33

1 answer

Gluz · Accepted Answer · 2013-08-04T13:56:38+0000

It turned out I had two errors:

When the S3 path consists of only the bucket name, it must end with a trailing slash.
The underscore is also a problem; the bucket name must be DNS-compliant.
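
With both fixes applied, the statement from the question might look like the sketch below. The bucket name bi-data is hypothetical (an assumed DNS-compliant rename of bi_data, with a hyphen in place of the underscore); substitute the real bucket name.

```sql
-- Sketch only: 'bi-data' is a hypothetical, DNS-compliant bucket name.
-- The LOCATION path ends with a trailing slash after the bucket name.
CREATE EXTERNAL TABLE se (id bigint, oc_date timestamp)
PARTITIONED BY (y string, m string, d string, h string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://bi-data/';
```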

I hope this helps someone.
