Error using regexp_split_to_table (Amazon Redshift)

I have the same question as this one:
Separating comma delimited fields in Postgresql and executing UNION ALL on all result tables
The only difference is that my column “fruit” is delimited by the symbol “|”. When I try:

SELECT yourTable.ID, regexp_split_to_table(yourTable.fruits, E'|') AS split_fruits FROM yourTable 

I get the following:

 ERROR: type "e" does not exist 

Q1. What does E do? I have seen some examples where E is not used. The official documentation does not explain it in its "quick brown fox ..." example.

Q2. How do I use '|' as the delimiter in my query?

Edit: I am using PostgreSQL 8.0.2. Neither unnest() nor regexp_split_to_table() is supported.

1 answer

A1

E is the prefix for POSIX-style escape strings. You usually do not need it in modern Postgres. Only add it if you want special characters in the string to be interpreted, like E'\n' for a newline character.

The E is pointless noise in your query, but it should still work. The answer you are referring to is not very good, I'm afraid.
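As a small illustration of what the E prefix changes, the following sketch compares an escape string with a plain string. It assumes a regular PostgreSQL server (9.1 or later) with standard_conforming_strings left at its default of on; it is not meant to describe Redshift's behavior.

```sql
-- Assumption: PostgreSQL 9.1+ with standard_conforming_strings = on (the default).
SELECT length(E'a\nb') AS len_escape  -- 3: \n is interpreted as one newline character
     , length('a\nb')  AS len_plain;  -- 4: the backslash and the n are two literal characters
```

With the E prefix the backslash sequence is interpreted; without it the backslash is just another character.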

A2

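The E prefix itself is not the problem, but the query is better without it. Note that | is a regular-expression metacharacter (alternation), so escape it to match a literal pipe (shown here for standard_conforming_strings = on, the default since Postgres 9.1):

 SELECT id, regexp_split_to_table(fruits, '\|') AS split_fruits FROM tbl; 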

For simple delimiters, you don't need expensive regular expressions. This is usually faster:

 SELECT id, unnest(string_to_array(fruits, '|')) AS split_fruits FROM tbl; 

In Postgres 9.3+, you would rather use LATERAL for set-returning functions:

 SELECT t.id, f.split_fruits FROM tbl t LEFT JOIN LATERAL unnest(string_to_array(fruits, '|')) AS f(split_fruits) ON true; 

Amazon Redshift is not Postgres

It implements only a reduced set of features, as documented in its manual. In particular, there are no table functions, including the essential functions unnest(), generate_series() and regexp_split_to_table(), when working with its "compute nodes" (that is, when accessing any tables).

You should use a normalized table layout to begin with (an extra table with one fruit per row).
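A normalized layout could look roughly like the sketch below. The table name tbl_fruit, the column sizes, and the sample values are made up for illustration; only the idea of one fruit per row comes from the answer.

```sql
-- Hypothetical normalized layout: one row per (id, fruit) pair.
CREATE TABLE tbl_fruit (
    id    int         NOT NULL,  -- same id as in the original table
    fruit varchar(64) NOT NULL   -- exactly one fruit per row
);

INSERT INTO tbl_fruit (id, fruit) VALUES
    (1, 'apple'), (1, 'banana'), (2, 'kiwi');

-- No string splitting is needed at query time:
SELECT id, fruit FROM tbl_fruit WHERE fruit = 'banana';
```

With this layout the delimiter problem disappears, because each value is already stored in its own row.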

Otherwise, this workaround should produce one row per fruit in Redshift:

  • Create a table of numbers with at least as many rows as there can be fruits in your column. Make it temporary, or permanent if you will keep using it. Say there are never more than 9:

     CREATE TEMP TABLE nr9(i int); INSERT INTO nr9(i) VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9); 
  • Join to the number table and use split_part(), which is implemented in Redshift:

     SELECT *, split_part(t.fruits, '|', n.i) AS fruit FROM nr9 n JOIN tbl t ON split_part(t.fruits, '|', n.i) <> '' 

Voila.
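The two steps above can be tried end to end with a small sketch like the following. The sample rows and the varchar size are assumptions made for illustration; the number table and the join follow the workaround as described. Skip the nr9 statements if that table already exists in your session.

```sql
-- Sample data, made up for illustration.
CREATE TEMP TABLE tbl (id int, fruits varchar(100));
INSERT INTO tbl (id, fruits) VALUES
    (1, 'apple|banana|cherry'),
    (2, 'kiwi');

-- Number table with as many rows as the longest list can have elements.
CREATE TEMP TABLE nr9 (i int);
INSERT INTO nr9 (i) VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9);

-- split_part() returns an empty string past the last element,
-- so the join condition keeps only positions that actually exist.
SELECT t.id, n.i, split_part(t.fruits, '|', n.i) AS fruit
FROM   nr9 n
JOIN   tbl t ON split_part(t.fruits, '|', n.i) <> ''
ORDER  BY t.id, n.i;

-- Expected rows:
--  id | i | fruit
--   1 | 1 | apple
--   1 | 2 | banana
--   1 | 3 | cherry
--   2 | 1 | kiwi
```

One consequence of the `<> ''` condition is that empty elements inside a list (as in 'a||b') are dropped as well, which may or may not be what you want.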


Source: https://habr.com/ru/post/983617/

