Hive: INSERT OVERWRITE for multiple partitions

I have a Hive table partitioned by date. I want to be able to selectively overwrite the partitions for the last "n" days (or a custom list of partitions).

Is there a way to do this without writing an "INSERT OVERWRITE DIRECTORY" statement for each partition?

Any help is greatly appreciated.

1 answer

Hive supports dynamic partitioning, so you can write a query in which the partition value is simply one of the selected source columns.

INSERT OVERWRITE TABLE dst PARTITION (dt) SELECT col0, col1, ... coln, dt FROM src WHERE ...

The WHERE clause determines which dt values (that is, which partitions) get overwritten.
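For the two cases in the question, the filter might look like the sketch below. It reuses dst, src and dt from the query above; col0 and col1 stand in for the real column list, and the 7-day window and the literal dates are made up for illustration. It also assumes that dt is stored as a string in yyyy-MM-dd format and that current_date is available (Hive 1.2.0 or later).

```sql
-- Sketch only: col0 and col1 are placeholder columns (assumption),
-- and dt is assumed to be a STRING partition column in yyyy-MM-dd format.

-- Case 1: overwrite every partition from 7 days ago onward
-- (7 is an arbitrary value of "n").
INSERT OVERWRITE TABLE dst PARTITION (dt)
SELECT col0, col1, dt
FROM src
WHERE dt >= CAST(date_sub(current_date, 7) AS STRING);

-- Case 2: overwrite a custom list of partitions
-- (the dates are illustrative).
INSERT OVERWRITE TABLE dst PARTITION (dt)
SELECT col0, col1, dt
FROM src
WHERE dt IN ('2015-03-01', '2015-03-05', '2015-03-09');
```

Only partitions for which the SELECT returns at least one row are overwritten; all other partitions of dst are left untouched.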

Just make sure the partition column (dt in this case) comes last in the SELECT list. You can even use SELECT *, dt if dt is already a column of the source, or SELECT *, my_udf(dt) AS dt, and so on.

By default, Hive requires at least one of the partition columns in the statement to be static, but you can relax this by switching to nonstrict mode. For the query above, set the following before running it:

 set hive.exec.dynamic.partition.mode=nonstrict; 
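A slightly fuller set of session settings that is often used together with the one above is sketched here. The two limit values are simply Hive's defaults, shown as a starting point rather than a recommendation; raise them only if the statement creates more partitions than they allow.

```sql
-- Dynamic partitioning itself must be enabled
-- (this is already the default since Hive 0.9.0).
SET hive.exec.dynamic.partition=true;

-- Allow all partition columns to be dynamic.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Optional: upper bounds on the number of partitions a single
-- statement may create, in total and per mapper/reducer node.
-- The values below are the defaults.
SET hive.exec.max.dynamic.partitions=1000;
SET hive.exec.max.dynamic.partitions.pernode=100;
```

These settings apply to the current session only, so they have to be issued in the same session as the INSERT OVERWRITE query.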

Source: https://habr.com/ru/post/1501021/

