Regular expression for validating a fixed-length field padded with spaces

Say I have a text file to parse that contains fixed-length records:

123jackysee        45678887
456charliewong     32145644
<3><------16------><--8---> # Not part of the data.

The first three characters are the identifier, then the username is 16 characters, then the 8-digit phone number.

I would like to write a regex to match and validate each line of input:

(\d{3})([A-Za-z ]{16})(\d{8})

The username must be 8 to 16 characters long. But ([A-Za-z ]{16}) would also match an empty value or a field containing only spaces. I thought about ([A-Za-z]{8,16} {0,8}), but that would accept a field wider than 16 characters. Any suggestions?
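To make the problem concrete, here is a small Python sketch that runs both patterns from the question against a few hand-built lines. The `^`/`$` anchors and the variable names are my additions for illustration, not part of the original patterns.

```python
import re

# The two patterns from the question, anchored so each must match a whole line.
fixed_width = re.compile(r"^(\d{3})([A-Za-z ]{16})(\d{8})$")
padded = re.compile(r"^(\d{3})([A-Za-z]{8,16} {0,8})(\d{8})$")

good = "123" + "jackysee".ljust(16) + "45678887"      # 3 + 16 + 8 = 27 characters
blank_name = "123" + " " * 16 + "45678887"            # username field is all spaces
too_wide = "123" + "a" * 16 + " " * 8 + "45678887"    # username field is 24 wide

print(bool(fixed_width.match(good)))        # True
print(bool(fixed_width.match(blank_name)))  # True: a blank username slips through
print(bool(padded.match(good)))             # True
print(bool(padded.match(too_wide)))         # True: the field is wider than 16
```

The first pattern fixes the width but not the minimum name length, and the second does the opposite, which is why the answers either add a length check or a lookahead.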

7 answers

No, no, no, no! :-)

Why try to pack so much into a single RE or SQL statement?

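My suggestion is to do something like this:

  • Check that the length is 27.
  • Extract the three fields (character positions 0-2, 3-18, 19-26).
  • Check that the first matches "\d{3}".
  • Check that the second matches "[A-Za-z]{8,} *".
  • Check that the third matches "\d{8}".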

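If you want the whole check on one line of your source, put it in a function, isValidLine(), and call that.

Even something like this would do: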

def isValidLine(s):
    if s.len() != 27 return false
    return s.match("^\d{3}[A-Za-z]{8,} *\d{8}$")

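Note that this is not real Python but PaxLang, my own pseudocode. The idea is simply to check that the length is 27 and then apply a simple RE.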
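For readers who want something runnable, a minimal Python sketch of the same idea could look like this. The names `LINE_RE` and `is_valid_line` are mine, and the character class is written as `[A-Za-z]` so that it matches letters only.

```python
import re

# Hypothetical names; the pattern follows the pseudocode, with [A-Za-z] as the class.
LINE_RE = re.compile(r"\d{3}[A-Za-z]{8,} *\d{8}")

def is_valid_line(s: str) -> bool:
    # 3 + 16 + 8 = 27, so the length check pins the username field to 16 columns.
    if len(s) != 27:
        return False
    # fullmatch() requires the whole string to match, like ^...$ in the pseudocode.
    return LINE_RE.fullmatch(s) is not None

print(is_valid_line("123" + "jackysee".ljust(16) + "45678887"))  # True
print(is_valid_line("789" + "jop".ljust(16) + "12345678"))       # False: name under 8 letters
print(is_valid_line("123jackysee 45678887"))                     # False: not 27 characters
```

Because the total length is fixed and the id and phone widths are fixed by the RE, the username plus its padding can only ever be 16 characters wide.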

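The 16-character width of the username field follows from the length check, so the RE does not have to enforce it. A single RE that does everything is possible, but it is much harder to read.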

If you insist on a single RE, it would be something like:

^\d{3}(([A-Za-z]{8} {8})
      |([A-Za-z]{9} {7})
      |([A-Za-z]{10} {6})
      |([A-Za-z]{11} {5})
      |([A-Za-z]{12} {4})
      |([A-Za-z]{13} {3})
      |([A-Za-z]{14} {2})
      |([A-Za-z]{15} {1})
      |([A-Za-z]{16}))
      \d{8}$
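If you do go the single-RE route, you need not type the nine alternatives by hand. Here is a hedged Python sketch that generates them; `alternatives` and `single_re` are names I made up, and non-capturing groups are used only to keep the example short.

```python
import re

# One alternative per username length: n letters followed by 16 - n spaces.
alternatives = "|".join(
    "[A-Za-z]{%d} {%d}" % (n, 16 - n) for n in range(8, 17)
)
single_re = re.compile(r"\d{3}(?:%s)\d{8}" % alternatives)

line = "123" + "jackysee".ljust(16) + "45678887"
print(single_re.fullmatch(line) is not None)  # True
```

Each alternative always consumes exactly 16 characters, so no separate length check is needed with this form.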

Alternatively, you could require the line to match two separate REs:

^\d{3}[A-Za-z]{8,} *\d{8}$
^.{27}$

but, since the second one is only a length check, that is no different from the isValidLine() above.

+7

If you can use a perl regex, and '_' is acceptable in the username:

perl -ne 'exit 1 unless /^(\d{3})(\w{8,16})\s*(\d{8})$/ && length == 28'
0

Here is a Regex that should do it:

(?P<id>\d{3})(?=[A-Za-z\s]{16}\d)(?P<username>[A-Za-z]{8,16})\s*(?P<phone>\d{8})

. 100% , , char - , .

, . RegEx, .

RegEx a) b) , .

EDIT:

On request (from Pax, among others!), here is a walkthrough of the RegEx:

(?P<id>\d{3})

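This defines a named group, "id", that captures the three-digit identifier. Named groups are not available in every RegEx engine. For more on them, see http://www.regular-expressions.info/named.html .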

(?=[A-Za-z\s]{16}\d)

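The ?= marks a lookahead. It succeeds only if the pattern that follows matches at the current position, but it consumes no characters, so matching continues from the same place afterwards. Check that your RegEx engine supports lookahead. For details, see http://www.regular-expressions.info/lookaround.html.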

(?P<username>[A-Za-z]{8,16})\s*

This matches the username itself, 8 to 16 letters, followed by whatever whitespace pads the field.

Finally,

(?P<phone>\d{8})

matches the eight-digit phone number.

, - RegEx lookahead, .

, Regex . Regex -.

You can wrap the Regex in ^ and $ so that it has to match the entire line.
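Python's `re` module accepts the `(?P<name>...)` syntax and lookahead, so the pattern can be tried there directly. This is only a sketch: `RECORD_RE` is a name I chose, and `fullmatch()` stands in for the ^ and $ anchors.

```python
import re

RECORD_RE = re.compile(
    r"(?P<id>\d{3})"                      # three-digit identifier
    r"(?=[A-Za-z\s]{16}\d)"               # lookahead: a 16-wide field, then a digit
    r"(?P<username>[A-Za-z]{8,16})\s*"    # 8 to 16 letters plus any padding
    r"(?P<phone>\d{8})"                   # eight-digit phone number
)

m = RECORD_RE.fullmatch("456" + "charliewong".ljust(16) + "32145644")
print(m.groupdict())
# {'id': '456', 'username': 'charliewong', 'phone': '32145644'}
```

The lookahead is what fixes the field at 16 columns: the username group alone would happily accept a shorter or longer field.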

0

I would try something like this:

(\d{3})([A-Za-z]{3,16} {0,13})(\d{8})

, , whitespace, - . , , .

0

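@OP, not every problem needs a regex. This one is simple to check with the string functions built into most languages. Here is a minimal example in Python.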

import sys

for line in open("file"):
    line = line.rstrip("\n")
    # the record must be exactly 27 characters wide
    if len(line) != 27: sys.exit()
    # check that the first 3 characters are digits
    if not line[0:3].isdigit(): sys.exit()
    # check the username: 8 to 16 ASCII letters once the padding is stripped
    name = line[3:19].rstrip(" ")
    if not (name.isascii() and name.isalpha()) or len(name) < 8: sys.exit()
    # check that the phone number is 8 digits
    if not line[19:27].isdigit(): sys.exit()
    print(line)
0

I would not try to do this with a single regex either. A first version, using split:

#!/usr/bin/perl

use strict;
use warnings;

while ( <DATA> ) {
    chomp;
    last unless /\S/;
    my @fields = split;
    if (
        ( my ($id, $name) = $fields[0] =~ /^([0-9]{3})([A-Za-z]{8,16})$/ )
            and ( my ($phone) = $fields[1] =~ /^([0-9]{8})$/ )
    ) {
        print "ID=$id\nNAME=$name\nPHONE=$phone\n";
    }
    else {
        warn "Invalid line: $_\n";
    }
}

__DATA__
123jackysee       45678887
456charliewong    32145644
678sdjkfhsdjhksadkjfhsdjjh 12345678

A version using unpack:

#!/usr/bin/perl

use strict;
use warnings;

while ( <DATA> ) {
    chomp;
    last unless /\S/;
    my ($id, $name, $phone) = unpack 'A3A16A8';
    if ( is_valid_id($id)
            and is_valid_name($name)
            and is_valid_phone($phone)
    ) {
        print "ID=$id\nNAME=$name\nPHONE=$phone\n";
    }
    else {
        warn "Invalid line: $_\n";
    }
}

sub is_valid_id    { ($_[0]) = ($_[0] =~ /^([0-9]{3})$/) }

sub is_valid_name  { ($_[0]) = ($_[0] =~ /^([A-Za-z]{8,16})\s*$/) }

sub is_valid_phone { ($_[0]) = ($_[0] =~ /^([0-9]{8})$/) }

__DATA__
123jackysee        45678887
456charliewong     32145644
678sdjkfhsdjhksadkjfhsdjjh 12345678

And a version with a table of validators:

#!/usr/bin/perl

use strict;
use warnings;

my %validators = (
    id    => make_validator( qr/^([0-9]{3})$/ ),
    name  => make_validator( qr/^([A-Za-z]{8,16})\s*$/ ),
    phone => make_validator( qr/^([0-9]{8})$/ ),
);

INPUT:
while ( <DATA> ) {
    chomp;
    last unless /\S/;
    my %fields;
    @fields{qw(id name phone)} = unpack 'A3A16A8';

    for my $field ( keys %fields ) {
        unless ( $validators{$field}->($fields{$field}) ) {
            warn "Invalid line: $_\n";
            next INPUT;
        }
    }

    print "$_ : $fields{$_}\n" for qw(id name phone);
}

sub make_validator {
    my ($re) = @_;
    return sub { ($_[0]) = ($_[0] =~ $re) };
}

__DATA__
123jackysee        45678887
456charliewong     32145644
678sdjkfhsdjhksadkjfhsdjjh 12345678
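For comparison, a rough Python counterpart of the table-of-validators idea might look like the sketch below. It is not a translation of the Perl above: `FIELDS` and `parse_record` are hypothetical names, slices replace `unpack`, and the explicit length check is an extra that the Perl version does not have.

```python
import re

# Field name -> (columns in the line, pattern the raw field must fully match).
FIELDS = {
    "id": (slice(0, 3), re.compile(r"[0-9]{3}")),
    "name": (slice(3, 19), re.compile(r"[A-Za-z]{8,16} *")),
    "phone": (slice(19, 27), re.compile(r"[0-9]{8}")),
}

def parse_record(line):
    """Return a dict of field values, or None if the line is invalid."""
    if len(line) != 27:  # assumption: reject lines of the wrong width outright
        return None
    record = {}
    for field, (columns, pattern) in FIELDS.items():
        value = line[columns]
        if pattern.fullmatch(value) is None:
            return None
        record[field] = value.rstrip(" ")  # drop the padding, like unpack's 'A'
    return record

print(parse_record("123" + "jackysee".ljust(16) + "45678887"))
# {'id': '123', 'name': 'jackysee', 'phone': '45678887'}
print(parse_record("678sdjkfhsdjhksadkjfhsdjjh 12345678"))
# None
```

Adding a field then means adding one entry to the table rather than editing a long pattern.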
0

You can use lookahead: ^(\d{3})((?=[a-zA-Z]{8,})([a-zA-Z ]{16}))(\d{8})$

Testing:

    123jackysee        45678887 Match
    456charliewong     32145644 Match
    789jop             12345678 No Match - username too short
    999abcdefghijabcde12345678 No Match - username 'column' is less than 16 characters
    999abcdefghijabcdef12345678 Match
    999abcdefghijabcdefg12345678 No Match - username column is more than 16 characters
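The test cases above can be replayed with a short Python script. This is a sketch: `LOOKAHEAD_RE`, `pad` and `cases` are names I introduced, and the last case is my own addition rather than one from the list.

```python
import re

LOOKAHEAD_RE = re.compile(r"^(\d{3})((?=[a-zA-Z]{8,})([a-zA-Z ]{16}))(\d{8})$")

def pad(name):
    # Hypothetical helper: pad a username with spaces to the 16-column field width.
    return name.ljust(16)

cases = {
    "8-letter name, padded": "123" + pad("jackysee") + "45678887",
    "11-letter name, padded": "456" + pad("charliewong") + "32145644",
    "3-letter name, padded": "789" + pad("jop") + "12345678",
    "15 letters, no padding": "999" + "abcdefghijabcde" + "12345678",
    "16 letters": "999" + "abcdefghijabcdef" + "12345678",
    "17 letters": "999" + "abcdefghijabcdefg" + "12345678",
    # Extra case: letters after a space inside the field are accepted too.
    "space inside the field": "999" + "abcdefgh ijklmno" + "12345678",
}

for label, line in cases.items():
    print(label, "->", "Match" if LOOKAHEAD_RE.match(line) else "No Match")
```

The last case matches because the lookahead only checks the first 8 characters of the field. If embedded spaces should be rejected, the field would need a stricter class such as letters followed by trailing spaces only.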
0

Source: https://habr.com/ru/post/1714838/

