Perl parse String with one or more fields

I have a string that I need to parse. It meets the following requirements:

  • It consists of 0 or more key->value pairs.
  • The key is always two letters.
  • A value is one or more digits.
  • There will be no space between the key and value.
  • There may or may not be a space between individual pairs.

Examples of strings that I might see:

  • AB1234 // One key->value pair (Key = AB, Value = 1234)
  • AB1234 BC2345 // Two key->value pairs, separated by a space
  • AB1234BC2345 // Two key->value pairs, not separated by a space
  • // Empty string, no key->value pairs
  • AB12345601BC1234CD1232PE2343 // Many key->value pairs, no spaces
  • AB12345601 BC1234 CD1232 PE2343 // Many key->value pairs, with spaces

I need to build a Perl hash from this string. If I could guarantee that it contained exactly one pair, I would do something like this:

    $string =~ /([A-Z][A-Z])([0-9]+)/;
    $key = $1;
    $value = $2;
    $hash{$key} = $value;

For multiple pairs, I could potentially take a substring of the original string after each match of the above expression (excluding the part already matched) and then search again. However, I'm sure there is a smarter, more Perl-ish way to achieve this.

Wishing I didn't have such a crappy data source to deal with,

Jonathan

+4
3 answers

In list context with the global (/g) flag, a regex match returns all the captured substrings:

    use strict;
    use warnings;
    use Data::Dumper;

    my @strs = (
        'AB1234',
        'AB1234 BC2345',
        'AB1234BC2345',
        '',
        'AB12345601BC1234CD1232PE2343',
        'AB12345601 BC1234 CD1232 PE2343',
    );

    for my $str (@strs) {
        # The money line: the captures come back as (key, value, key, value, ...)
        my %parts = ($str =~ /([A-Z][A-Z])(\d+)/g);
        print Dumper(\%parts);
    }

For extra obscurity, drop the parentheses around the pattern match: %parts = $str =~ /([A-Z][A-Z])(\d+)/g; .
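The list-context match can also be wrapped in a small helper. This is only a sketch: the name parse_pairs is made up for illustration, and it assumes keys are two upper-case ASCII letters, as in the question's examples.

```perl
use strict;
use warnings;

# parse_pairs is a hypothetical helper name, not part of the answer above.
# Assumes each key is two upper-case ASCII letters followed by one or more digits.
sub parse_pairs {
    my ($str) = @_;
    # In list context, /g returns every captured group in order
    # (key1, value1, key2, value2, ...), which assigns directly to a hash.
    # A string with no pairs yields an empty list, and so an empty hash.
    my %pairs = $str =~ /([A-Z]{2})(\d+)/g;
    return \%pairs;
}

my $pairs = parse_pairs('AB12345601 BC1234CD1232');
for my $key (sort keys %$pairs) {
    print "$key => $pairs->{$key}\n";
}
```

Note that if the same key appears twice in one string, the later value silently overwrites the earlier one.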

+8

You're almost there:

 $hash{$1} = $2 while $string =~ /([[:alpha:]]{2})([0-9]+)/g 
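If malformed input is a concern, the same loop can be made stricter. The following is a sketch rather than part of the answer: it anchors each match with \G so that nothing between pairs is skipped, and the name parse_strict and the choice to return undef on bad input are assumptions made for illustration.

```perl
use strict;
use warnings;

# parse_strict is a hypothetical name. It returns a hash reference, or undef
# if the string contains anything other than key/value pairs and whitespace.
sub parse_strict {
    my ($string) = @_;
    my %hash;
    pos($string) = 0;
    # \G anchors each match where the previous one ended; /c keeps pos()
    # after the final failed match so the check below can use it.
    while ( $string =~ /\G\s*([[:alpha:]]{2})([0-9]+)/gc ) {
        $hash{$1} = $2;
    }
    # Only trailing whitespace may remain after the last pair.
    return $string =~ /\G\s*\z/ ? \%hash : undef;
}

for my $input ( 'AB1234 BC2345', 'AB1234BC2345', '', 'ABC123', 'A122' ) {
    my $result = parse_strict($input);
    printf "%-15s %s\n", "'$input'", defined $result ? 'valid' : 'invalid';
}
```

With this variant, strings such as ABC123 or A122 are rejected instead of being partially matched.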
+3

Assuming your strings definitely match your pattern (i.e. there won't be any strings of the form A122 or ABC123), this should work:

    use Data::Dumper;

    my @strings = ( 'AB1234', 'AB1234 BC2345', 'AB1234BC2345' );

    foreach my $string (@strings) {
        $string =~ s/\s+//g;    # strip all whitespace
        # Splitting on a captured key keeps the keys in the result;
        # the first element is the (empty) text before the first key.
        my ( $first, %elems ) = split( /([A-Z]{2})/, $string );
        while ( my ( $key, $value ) = each %elems ) {
            # Drop anything that is not a two-letter key with a numeric value
            delete $elems{$key}
                unless $key =~ /^[A-Z]{2}$/ && $value =~ /^\d+$/;
        }
        print Dumper \%elems;
    }
0

Source: https://habr.com/ru/post/1383135/