Perl 6: How to get all lines that are not indented by a given width of whitespace?

Question

I am working on a very large text file whose lines have different indentation widths. The valid lines have an indentation width of 12 characters, created by a combination of tabs and spaces. I want to get all lines that do not have a 12-character indentation width; these lines have an indentation width of 0 to 11 characters, also made from combinations of tabs and spaces. This is my attempt:

if $badLine !~~ m/ ^^ [\s ** 12 ||
                      \t \s ** 4 ||
                      \s \t \s ** 3 ] / { say $badLine; }

The problem is that when you edit a text file in a word processor, pressing the tab key can produce anywhere from 0 to 8 characters' width of space to fill the gap up to the next tab stop. What would be a reasonable way to get all the bad lines that do not have a 12-character-wide indentation?

Thanks.

+4

width match perl6 tabs space

lisprogtor Jan 28 '17 at 7:50

2 answers

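Assuming tab stops every 8 columns (as in the code below), one option is to compute the indentation width explicitly: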

# some test input
my \INPUT = qq:to/EOI/;
           11s
            12s
             13s
\t    1t 4s
 \t   1s 1t 3s
    4s
   \t    3s 1t 4s
        \t8s 1t
EOI

# compute indentation width
sub indent-width($_) {
    my $n = 0;

    # iterate over characters
    for .comb {
        # a tab advances to the next tab stop (a multiple of 8)
        when "\t" { $n += 8 - $n % 8 }
        default { ++$n }
    }
    $n;
}

# generate output, see below
say ?/^ :r (\h+) <?{ indent-width(~$0) == 12 }> /, " {.trim}"
    for INPUT.lines;

/^ :r (\h+) <?{ indent-width(~$0) == 12 }> /

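The regex uses a code assertion, <?{...}>, which succeeds only if the whitespace captured in $0 has an indentation width of 12.

Note the :r (ratchet) adverb, which disables backtracking in the regex: without it, a line indented by more than 12 could still match via a shorter run of whitespace whose width is 12.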
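
As a cross-check of the tab-stop arithmetic in indent-width, here is a minimal Python sketch of the same computation. The name `indent_width` and the `tab_size` parameter are illustrative and not part of the answer; the sketch assumes tab stops every 8 columns and treats only spaces and tabs as indentation (the Raku `\h` class is broader).

```python
def indent_width(line: str, tab_size: int = 8) -> int:
    """Return the display width of the leading spaces and tabs of `line`."""
    width = 0
    for ch in line:
        if ch == "\t":
            # a tab advances to the next multiple of tab_size
            width += tab_size - width % tab_size
        elif ch == " ":
            width += 1
        else:
            break  # the first non-indentation character ends the indent
    return width


# A few of the test lines from the answer, written with real tab characters.
samples = [" " * 12 + "12s", "\t    1t 4s", " \t   1s 1t 3s", "        \t8s 1t"]
widths = [indent_width(s) for s in samples]
print(widths)  # [12, 12, 11, 16]
```

Only the first two samples have a width of exactly 12, so only they would count as valid lines.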

+4

Christoph Jan 28 '17 at 12:00

smls · Accepted Answer · 2017-01-28T14:35:48+0000

Width 12

For an indentation width of 12, assuming tab stops at positions 0, 8, 16, etc.:

for $input.lines {
    .say if not /
        ^                             # start of line
        [" " ** 8 || " " ** 0..7 \t]  # whitespace up to first tab stop
        [" " ** 4]                    # whitespace up to position 12
        [\S | $]                      # non-space character or end of line
    /;
}

Explanation:

To get from the beginning of the line (position 0) to the first tab stop (position 8), there are two possibilities that we need to match:
- 8 spaces.
- 0 to 7 spaces, and then 1 tab. (The tab jumps straight to the tab stop, so it fills whatever width remains after the spaces.)
The only way to get from the tab stop (position 8) to the indentation goal (position 12) is to use 4 spaces. (A tab would jump past the target to the next tab stop at position 16.)
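After the indentation, the pattern requires a non-space character or the end of the line, so that lines indented by more than 12 do not match.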
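
The same pattern can be tried outside Perl 6. The following is a rough Python translation using the standard `re` module, offered only as a way to experiment with the idea; the names `INDENT_12` and `is_bad` are made up for this sketch, and it again assumes tab stops every 8 columns.

```python
import re

# Same structure as the pattern above: reach column 8 (8 spaces, or up to
# 7 spaces plus a tab), then 4 spaces, then a non-space character or the end.
INDENT_12 = re.compile(r"^(?: {8}| {0,7}\t) {4}(?:\S|$)")


def is_bad(line: str) -> bool:
    """True if the line is NOT indented by exactly 12 columns."""
    return INDENT_12.match(line) is None


lines = [
    " " * 11 + "x",   # width 11
    " " * 12 + "x",   # width 12
    " " * 13 + "x",   # width 13
    "\t    x",        # tab + 4 spaces: width 12
    " \t   x",        # space, tab, 3 spaces: width 11
    "   \t    x",     # 3 spaces, tab, 4 spaces: width 12
]
bad = [is_bad(line) for line in lines]
print(bad)  # [True, False, True, False, True, False]
```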

To match other indentation widths, the same logic can be wrapped in a parameterized named token:

my token indent ($width) {
    [" " ** 8 || " " ** 0..7 \t] ** {$width div 8}
     " " ** {$width % 8}
}

.say if not /^ <indent(12)> [\S | $]/ for $input.lines;

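Explanation:

First, match the "whitespace up to the next tab stop" group once for each full tab stop that fits in the indentation width. ($width div 8, where div is integer division.)
Then, match as many spaces as are left over after the last tab stop. ($width % 8, where % is the modulo operator.)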
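
To sanity-check the div/modulo split, the sketch below builds an equivalent pattern in Python and compares it against a direct width computation for every space/tab prefix of up to 13 characters. `indent_pattern` and `indent_width` are hypothetical helper names, and the tab size of 8 is the same assumption as in the answer.

```python
import re
from itertools import product


def indent_pattern(width, tab_size=8):
    """Build a regex for lines indented by exactly `width` columns."""
    full_stops, remainder = divmod(width, tab_size)   # e.g. 12 -> (1, 4)
    to_next_stop = r"(?: {%d}| {0,%d}\t)" % (tab_size, tab_size - 1)
    return re.compile(r"^%s{%d} {%d}(?:\S|$)" % (to_next_stop, full_stops, remainder))


def indent_width(line, tab_size=8):
    """Reference implementation: walk the leading spaces and tabs."""
    width = 0
    for ch in line:
        if ch == "\t":
            width += tab_size - width % tab_size
        elif ch == " ":
            width += 1
        else:
            break
    return width


# Brute-force comparison over every space/tab prefix of up to 13 characters.
pattern = indent_pattern(12)
mismatches = [
    prefix
    for n in range(14)
    for prefix in map("".join, product(" \t", repeat=n))
    if bool(pattern.match(prefix + "x")) != (indent_width(prefix) == 12)
]
print(len(mismatches))  # 0 when the pattern and the direct computation agree
```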

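The token above assumes that the indentation starts at a tab stop (for example, at the start of the line). To also handle indentation that starts at an arbitrary position, a more general version is: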

my token indent ($width) {
    :my ($before-first-stop, $number-of-stops, $after-last-stop);
    {
        $before-first-stop = min $width, 8 - $/.from % 8;
        $number-of-stops   = ($width - $before-first-stop) div 8;
        $after-last-stop   = ($width - $before-first-stop) % 8;
    }
    [" " ** {$before-first-stop} || " " ** {^$before-first-stop} \t]
    [" " ** 8 || " " ** 0..7 \t] ** {$number-of-stops}
     " " ** {$after-last-stop}
}

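Explanation:

First, compute how many columns lie between the start of the match and the next tab stop, how many full tab stops follow, and how many columns remain after the last one.
$/.from gives the position at which the current match starts.
The first group then covers the (possibly partial) stretch up to the first tab stop; the rest works as in the previous token.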
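
The three quantities are plain integer arithmetic and can be checked in isolation. This Python sketch mirrors the assignments in the token; `split_indent` and its `start` parameter (the column where the indentation begins, which the Raku code takes from $/.from) are names invented for the example, with tab stops assumed every 8 columns.

```python
def split_indent(start, width, tab_size=8):
    """Split an indent of `width` columns that begins at column `start`.

    Returns (before_first_stop, number_of_stops, after_last_stop).
    """
    # columns up to the next tab stop, capped by the requested width
    before_first_stop = min(width, tab_size - start % tab_size)
    # whole tab stops that fit in what is left, and the leftover columns
    number_of_stops, after_last_stop = divmod(width - before_first_stop, tab_size)
    return before_first_stop, number_of_stops, after_last_stop


print(split_indent(0, 12))  # (8, 0, 4)
print(split_indent(3, 12))  # (5, 0, 7)
print(split_indent(6, 20))  # (2, 2, 2)
```

In each case the three parts add up to the requested width: 8 + 0*8 + 4 = 12, 5 + 0*8 + 7 = 12, and 2 + 2*8 + 2 = 20.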
