How to define nested collection items in Hive

I am trying to create a Hive table with nested collection items. Suppose I have an array of structs:

CREATE TABLE SAMPLE( record array<struct<col1:string,col2:string>> )row format delimited fields terminated by ',' collection items terminated by '|'; 

At the first level, the delimiter ',' overrides the default field delimiter '^A'.

At the second level, the delimiter '|' overrides the default second-level delimiter '^B' and separates the elements of the outer structure (i.e. the array).

At the third level, Hive uses the default third-level delimiter '^C' to separate the fields of the struct.
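To make the three levels concrete, here is a minimal Python sketch of how one row of such a file could be written. The helper names (encode_record, encode_row) are hypothetical and only illustrate the layout described above; they are not part of Hive.

```python
# Delimiters as described in the question: ',' and '|' are set in the DDL,
# while struct fields fall back to the default third-level delimiter ^C.
FIELD_DELIM = ","        # replaces the default ^A (\x01)
ITEM_DELIM = "|"         # replaces the default ^B (\x02)
STRUCT_DELIM = "\x03"    # default ^C, a non-printable control character


def encode_record(structs):
    """Serialize a list of (col1, col2) pairs as one array<struct<...>> field."""
    return ITEM_DELIM.join(STRUCT_DELIM.join(pair) for pair in structs)


def encode_row(columns):
    """Join already-encoded top-level columns into one line of the data file."""
    return FIELD_DELIM.join(columns)


# SAMPLE has a single column, so the row is just the encoded array.
line = encode_row([encode_record([("a", "b"), ("c", "d")])])
print(repr(line))  # 'a\x03b|c\x03d'
```

The repr output shows the problem: the struct fields are separated by an invisible control character.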

Now my question is how to define a delimiter for the third level (i.e. the struct), because the '^C' character is difficult to read and to generate.

Is there a way to explicitly define a delimiter instead of '^C'?

Thanks in advance.

1 answer

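Try adding a MAP KEYS TERMINATED BY clause. With the delimited row format, Hive uses the map-key delimiter as the third-level delimiter, so it also separates the struct fields inside the array: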

 CREATE TABLE SAMPLE( id BIGINT, record array<struct<col1:string,col2:string>> )row format delimited fields terminated by ',' collection items terminated by '|' map keys terminated by ':'; 

Now the data in the text file will look like this:

 1345653,110909316904:1341894546|221065796761:1341887508 

Then you can query it like:

 select record.col1 from SAMPLE; 
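
As a rough illustration of how that sample line breaks down under the three delimiters, here is a small Python sketch. The function parse_row is hypothetical and only mimics the splitting order; it is not how Hive itself is implemented.

```python
def parse_row(line, field_delim=",", item_delim="|", struct_delim=":"):
    """Split one line into (id, record), mirroring the table definition above."""
    # Level 1: top-level columns (id, record).
    id_text, record_text = line.split(field_delim, 1)
    # Level 2: array elements; level 3: fields of each struct.
    record = [
        dict(zip(("col1", "col2"), item.split(struct_delim)))
        for item in record_text.split(item_delim)
    ]
    return int(id_text), record


row_id, record = parse_row(
    "1345653,110909316904:1341894546|221065796761:1341887508"
)

# Rough equivalent of `select record.col1 from SAMPLE` for this one row.
col1_values = [item["col1"] for item in record]
print(col1_values)  # ['110909316904', '221065796761']
```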

Source: https://habr.com/ru/post/950879/

