Querying a Hive partitioned table by date/time range

My Hive table is partitioned by year, month, day, and hour.

Now I want to get the data from 2014-05-27 to 2014-06-05. How can I do this?

I know that one option is to partition on epoch time (or on a yyyy-mm-dd-hh string) and pass the time range in the query. Can I do this without losing the date hierarchy?

Table structure

CREATE TABLE IF NOT EXISTS table1 (col1 int, col2 int)
PARTITIONED BY (year int, month int, day int, hour int) 
STORED AS TEXTFILE;
+4
2 answers

We run into this scenario every day when querying tables in Hive. We partition our tables much as you describe, and it has helped a lot with querying. Here is how we partition:

CREATE TABLE IF NOT EXISTS table1 (col1 int, col2 int)
PARTITIONED BY (year bigint, month bigint, day bigint, hour int) 
STORED AS TEXTFILE;

For the partitions, we assign values like this:

year = 2014, month = 201409, day = 20140924, hour = 01
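
As a minimal sketch of how one hourly partition with these encoded values could be registered (assuming the `table1` definition above; the partition values are just the sample values from this answer, and in practice partitions may instead be created by your load job):

```sql
-- Register one hourly partition using the encoded values:
-- month carries the year (yyyymm) and day carries the full date (yyyymmdd).
ALTER TABLE table1 ADD IF NOT EXISTS
PARTITION (year = 2014, month = 201409, day = 20140924, hour = 1);

-- List the table's partitions to check the resulting layout.
SHOW PARTITIONS table1;
```

Because each level repeats the higher-level fields, any single partition column is enough to identify a point in the calendar at that granularity.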

With this scheme, querying a date range is simple, because you can filter on the day partition directly:

select * from table1 where day >= 20140527 and day < 20140605 
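
Note that `day < 20140605` leaves out June 5 itself. The sketch below shows an inclusive variant for this scheme, followed by an assumed equivalent for the question's original layout (month as 1-12, day as 1-31), which keeps the date hierarchy. Both filter only on partition columns, so Hive should generally be able to prune partitions, though that is worth confirming with `EXPLAIN` on your version.

```sql
-- Scheme from this answer: day = yyyymmdd.
-- BETWEEN is inclusive on both ends, so June 5 is included.
SELECT col1, col2
FROM table1
WHERE day BETWEEN 20140527 AND 20140605;

-- Assumption: the question's original layout, where month is 1-12
-- and day is 1-31. The range crosses a month boundary, so each
-- month gets its own day bounds.
SELECT col1, col2
FROM table1
WHERE year = 2014
  AND ((month = 5 AND day >= 27)
    OR (month = 6 AND day <= 5));
```

The second form grows with every month or year boundary the range crosses, which is the main reason the encoded `day` column is convenient.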


+8

You can filter like this:

  WHERE st_date > '2014-05-27-00' and end_date < '2014-06-05-24' 

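This gives the desired result even though the values are strings, because strings are compared lexicographically: "2014-04-04" is always greater than "2014-04-03".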
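
The filter above compares strings, which orders dates correctly only when every field is zero-padded to a fixed width. A minimal sketch of the difference, assuming a Hive version that allows `SELECT` without a `FROM` clause (0.13 or later):

```sql
-- Zero-padded values: character-by-character order matches date order.
SELECT '2014-04-04' > '2014-04-03';   -- true

-- Unpadded month: '1' sorts before '9', so October 1 compares as
-- smaller than September 30 even though it is the later date.
SELECT '2014-10-01' > '2014-9-30';    -- false
```

If the stored values are not consistently padded, comparing parsed dates or integer-encoded values is the safer choice.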

+2

Source: https://habr.com/ru/post/1546250/

