It looks like you need the /e (evaluate) modifier on the substitution, so that the replacement text is treated as a piece of Perl code:
$str =~ s/((.)\2+)/$2 . length($1)/ge;
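To see what the /e modifier changes, here is a minimal sketch that runs the same substitution with and without it. The helper names without_e and with_e are made up for illustration; only the substitution itself comes from the answer.

```perl
use strict;
use warnings;

# Hypothetical helper: same substitution, but WITHOUT /e.
sub without_e {
    my ($str) = @_;
    # The replacement is an interpolated string here, so " . length(" and ")"
    # are copied literally and only $1 and $2 are filled in.
    $str =~ s/((.)\2+)/$2 . length($1)/g;
    return $str;
}

# Hypothetical helper: the substitution as given in the answer, WITH /e.
sub with_e {
    my ($str) = @_;
    # The replacement is evaluated as a Perl expression for every match.
    $str =~ s/((.)\2+)/$2 . length($1)/ge;
    return $str;
}

print without_e("aaabb"), "\n";   # prints: a . length(aaa)b . length(bb)
print with_e("aaabb"), "\n";      # prints: a3b2
```

Without /e the expression is never executed, which is why the count does not show up in the result.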
Script
#!/usr/bin/env perl
use strict;
use warnings;

my $original    = "aaabbcccdddd";
my $alternative = "aaabbcccddddeffghhhhhhhhhhhh";

sub proc1
{
    my($str) = @_;
    $str =~ s/(.)\1+/$1/g;
    print "$str\n";
}

proc1 $original;
proc1 $alternative;

sub proc2
{
    my($str) = @_;
    $str =~ s/((.)\2+)/$2 . length($1)/ge;
    print "$str\n";
}

proc2 $original;
proc2 $alternative;
Output
abcd
abcdefgh
a3b2c3d4
a3b2c3d4ef2gh12
Could you break down the regex and explain how it works?
I assume it is the match part that is problematic, not the replacement part.
Original regex:
(.)\1+
This captures a single character with (.), followed by the same character repeated one or more times (\1+).
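As a rough illustration of what the original regex matches, the sketch below collects every run it finds in a string. The helper name runs and the sample input are assumptions made for the example, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: return every run matched by (.)\1+ in $str.
sub runs {
    my ($str) = @_;
    my @runs;
    while ($str =~ /(.)\1+/g) {
        push @runs, $&;   # $& holds the whole matched text (the full run)
    }
    return @runs;
}

print join(",", runs("aaabbcccddddeff")), "\n";   # prints: aaa,bb,ccc,dddd,ff
```

Note that the lone "e" is not listed: \1+ requires at least one repetition, so single characters never match and are left alone by the substitution.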
The revised regex is the same, but also captures the entire pattern:
((.)\2+)
The first open parenthesis starts the overall capture; the second open parenthesis starts the capture of a single character. That single character is now the second capture group, so the \1 in the original has to become \2 in the revision.
Since the search captures the entire run of repeated characters in $1, the replacement can easily determine the length of the run with length($1).
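To make the group numbering concrete, here is a small sketch that reports both captures for each match of the revised regex. The helper name describe_runs and its "char:run:length" output format are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: for each run, return "char:run:length".
sub describe_runs {
    my ($str) = @_;
    my @out;
    while ($str =~ /((.)\2+)/g) {
        # $1 is the outer group (the whole run),
        # $2 is the inner group (the single repeated character).
        push @out, join(":", $2, $1, length($1));
    }
    return @out;
}

print join(" ", describe_runs("aaabbcccdddd")), "\n";
# prints: a:aaa:3 b:bb:2 c:ccc:3 d:dddd:4
```

Groups are numbered by the position of their opening parenthesis, which is why the outer group is $1 even though it closes last.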