Is there something like the zip () function in PostgreSQL that combines two arrays?

I have two values ​​of an array of the same length in PostgreSQL:

{a,b,c} and {d,e,f}

and I would like to combine them into

{{a,d},{b,e},{c,f}}

Is there any way to do this?

+16
arrays sql postgresql
Sep 13
source share
2 answers

Postgres 9.3 and later

Simple zip ()

Consider the following demo for Postgres 9.3 or earlier :

 SELECT ARRAY[a,b] AS ab FROM ( SELECT unnest('{a,b,c}'::text[]) AS a ,unnest('{d,e,f}'::text[]) AS b ) x; 

Result:

  ab ------- {a,d} {b,e} {c,f} 

Note that both arrays must have the same number of elements for parallel printing, or instead you get a cross-connection.

You can wrap this in a function if you want:

 CREATE OR REPLACE FUNCTION zip(anyarray, anyarray) RETURNS SETOF anyarray LANGUAGE SQL AS $func$ SELECT ARRAY[a,b] FROM (SELECT unnest($1) AS a, unnest($2) AS b) x; $func$; 

Call:

 SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]); 

The same result.

zip () for a multidimensional array:

Now, if you want to fill this new set of arrays into a single 2-dimenstional array, it becomes more complex.

 SELECT ARRAY (SELECT ...) 

or

 SELECT array_agg(ARRAY[a,b]) AS ab FROM ( SELECT unnest('{a,b,c}'::text[]) AS a ,unnest('{d,e,f}'::text[]) AS b ) x 

or

 SELECT array_agg(ARRAY[ARRAY[a,b]]) AS ab FROM ... 

will result in the same error message (checked with pg 9.1.5):

ERROR: could not find array type for data type text []

But there is a way around this, as we developed into this close question .
Create a custom aggregation function:

 CREATE AGGREGATE array_agg_mult (anyarray) ( SFUNC = array_cat ,STYPE = anyarray ,INITCOND = '{}' ); 

And use it as follows:

 SELECT array_agg_mult(ARRAY[ARRAY[a,b]]) AS ab FROM ( SELECT unnest('{a,b,c}'::text[]) AS a ,unnest('{d,e,f}'::text[]) AS b ) x 

Result:

 {{a,d},{b,e},{c,f}} 

Pay attention to the additional layer ARRAY[] ! Without it, and simply:

 SELECT array_agg_mult(ARRAY[a,b]) AS ab FROM ... 

You get:

 {a,d,b,e,c,f} 

What may be useful for other purposes.

Reset another function:

 CREATE OR REPLACE FUNCTION zip2(anyarray, anyarray) RETURNS SETOF anyarray LANGUAGE SQL AS $func$ SELECT array_agg_mult(ARRAY[ARRAY[a,b]]) FROM (SELECT unnest($1) AS a, unnest($2) AS b) x; $func$; 

Call:

 SELECT zip2('{a,b,c}'::text[],'{d,e,f}'::text[]); -- or any other array type 

Result:

 {{a,d},{b,e},{c,f}} 

Postgres 9.4 +

Use the ROWS FROM construct or the updated unnest() , which parallelizes multiple arrays. Each of them can have a different length. You get ( for the documentation ):

[...] the number of rows of the result in this case is the number of the largest function, the result with smaller results, supplemented by zero values ​​that will correspond.

Use this cleaner and simpler option:

 SELECT ARRAY[a,b] AS ab FROM unnest('{a,b,c}'::text[] , '{d,e,f}'::text[]) x(a,b); 

Postgres 9.5 +

sends array_agg(array expression) :

 Function Argument Type(s) Return Type array_agg(expression) any array type same as argument data type Description input arrays concatenated into array of one higher dimension (inputs must all have same dimensionality, and cannot be empty or NULL) 

This is a replacement for my custom aggregate function array_agg_mult() , implemented in C, which is much faster. Use it.

+33
Sep 13 '12 at 21:14
source share

Here's another approach, safe for arrays of different lengths, using the multi-aggregation array mentioned by Erwin:

 CREATE OR REPLACE FUNCTION zip(array1 anyarray, array2 anyarray) RETURNS text[] AS $$ SELECT array_agg_mult(ARRAY[ARRAY[array1[i],array2[i]]]) FROM generate_subscripts( CASE WHEN array_length(array1,1) >= array_length(array2,1) THEN array1 ELSE array2 END, 1 ) AS subscripts(i) $$ LANGUAGE sql; regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]); zip --------------------- {{a,d},{b,e},{c,f}} (1 row) regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f,g}'::text[]); zip ------------------------------ {{a,d},{b,e},{c,f},{NULL,g}} (1 row) regress=> SELECT zip('{a,b,c,z}'::text[],'{d,e,f}'::text[]); zip ------------------------------ {{a,d},{b,e},{c,f},{z,NULL}} (1 row) 

If you want to cancel the excess, not the zero space, just change the length test >= to <= .

This function does not handle the rather strange PostgreSQL function, in which arrays can have a pointer element other than 1, but in practice no one actually uses this function. For example, with a 3-element array with a zero index:

 regress=> SELECT zip('{a,b,c}'::text[], array_fill('z'::text, ARRAY[3], ARRAY[0])); zip ------------------------ {{a,z},{b,z},{c,NULL}} (1 row) 

since Erwin code works with such arrays and even with multidimensional arrays (by smoothing them), but does not work with arrays of different lengths.

Arrays are a bit special in PostgreSQL, they are a little flexible with multidimensional arrays, a custom index of origin, etc.

In 9.4 you can write:

 SELECT array_agg_mult(ARRAY[ARRAY[a,b]) FROM unnest(array1) WITH ORDINALITY as (o,a) NATURAL FULL OUTER JOIN unnest(array2) WITH ORDINALITY as (o,b); 

which will be much nicer, especially if the optimization is for scanning functions together, and not for sorting and combining.

+6
Jul 29 '13 at 6:03
source share



All Articles