String sequential grouping in Oracle

My task is to group runs of identical digits/characters in a given string. For example, for the string 4455599 the SQL output should be 44 555 99, and the following query produces that:

 with t(str) as (
   select '4455599' from dual
 )
 select listagg(str_grouped, ' ') within group (order by rownum) str_split
 from (
   select listagg(str) within group (order by lvl) str_grouped
   from (
     select level lvl,
            substr(str, level, 1) str,
            dense_rank() over (order by substr(str, level, 1)) drank_no
     from t
     connect by level <= length(str)
   )
   group by drank_no
 );

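But the query fails on the following data because I am using dense_rank, which gives every occurrence of a character the same rank regardless of position, so non-adjacent runs of the same character are merged.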

445559944: expected 44 555 99 44, but got 4444 555 99.

bb119911: expected bb 11 99 11, but got 1111 99 bb.

Any help is appreciated; regex-based solutions are welcome.

1 answer

Backreferences to the rescue:

 select regexp_replace('4455599', '((.)\2*)', '\1 ') from dual; 

Output:

 44 555 99 

Explanation

((.)\2*) defines two capture groups, where:

(.) matches any single character and captures it in group 2.

\2* is a backreference to the character captured in group 2; it matches that same character zero or more times.

((.)\2*) therefore matches a run of one or more of the same character and captures the whole run in group 1.

The replacement string '\1 ' replaces each match with the contents of group 1 followed by a space.
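
Because the replacement appends a space after every run, including the last one, the result ends with a trailing space. A minimal sketch of one way to drop it, assuming the trailing space is unwanted and the input itself does not end in spaces that must be preserved, is to wrap the call in RTRIM:

```sql
-- RTRIM strips the space that the replacement adds after the final run.
-- Caveat: it also strips any trailing spaces that came from the input itself.
select rtrim(regexp_replace('4455599', '((.)\2*)', '\1 ')) as str_split
from dual;
```

With the sample input this should return 44 555 99 with no trailing space.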

Capture groups are numbered by the position of their opening parenthesis, from left to right, starting at 1. So, if you have a pattern (((a)b)c)d, the innermost (a) is group 3, ((a)b) is group 2, and (((a)b)c) is group 1. In most regex engines (though not Oracle's), the complete match of (((a)b)c)d is also available as group 0.
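
The numbering can be checked directly by echoing each group in the replacement string. This is only an illustrative sketch; the labels g1, g2, g3 and the column alias group_contents are made up for the example:

```sql
-- Each \n in the replacement is substituted with the text captured by group n.
select regexp_replace('abcd', '(((a)b)c)d', 'g1=\1 g2=\2 g3=\3') as group_contents
from dual;
-- g1=abc g2=ab g3=a
```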

Test cases

 select val,
        regexp_replace(val, '((.)\2*)', '\1 ') as result
 from (
   select '445559944' as val from dual union all
   select 'bb119911' as val from dual union all
   select '46455599464' as val from dual
 ) foo;

Output:

 VAL         RESULT
 ----------- ------------------
 445559944   44 555 99 44
 bb119911    bb 11 99 11
 46455599464 4 6 4 555 99 4 6 4
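
If a non-regex approach is preferred, the question's query could probably be repaired by replacing dense_rank with the row-number difference trick for consecutive runs (often called gaps and islands). This is a sketch of a swapped-in technique rather than part of the answer above; the names chars, runs, grouped, ch, grp, and first_pos are hypothetical, and it assumes a single input row, as in the question:

```sql
with t(str) as (
  select '445559944' from dual
),
chars as (
  -- One row per character, numbered by position.
  select level as lvl, substr(str, level, 1) as ch
  from t
  connect by level <= length(str)
),
runs as (
  -- Position minus per-character position is constant within a consecutive run.
  select lvl, ch,
         row_number() over (order by lvl)
       - row_number() over (partition by ch order by lvl) as grp
  from chars
),
grouped as (
  -- Glue each run back together and remember where it started.
  select min(lvl) as first_pos,
         listagg(ch) within group (order by lvl) as run
  from runs
  group by ch, grp
)
select listagg(run, ' ') within group (order by first_pos) as str_split
from grouped;
```

For 445559944 this should give 44 555 99 44, because the two runs of 4 fall into different grp values and are ordered by their starting position.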

Source: https://habr.com/ru/post/987473/