Using a regular expression to extract a matching pattern from a string and assign it to a variable using perl

Question

I am looking for advice on using Perl and a regex to extract the part of a string that always appears as the first item in parentheses, and to assign that value to a variable.

Here is the exact situation. I am using Perl and a regex to extract the course ID from a university catalog and assign it to a variable. Consider the following records:

BIO-2109-01 (12345) Introduction to Biology
CHM-3501-F2-01 (54321) Introduction to Chemistry
IDS-3250-01 (98765) US History (1860-2000)
SPN-1234-02-F1 (45678) History of Spain (1900-2010)

The typical format is [course section] [(course ID)] [course name]

My goal is to write a script that takes each record, one at a time, assigns it to a variable, and then uses a regex to extract only the course ID and assign it to a variable of its own.

My approach has been to use search and replace to replace everything that is not the course ID with '' (the empty string), and then keep what remains (the course ID) in a variable. Here are a couple of examples of what I have tried:

$string = "BIO-2109-01 (12345) Introduction to Biology";
($courseID = $string) =~ s/[^\d\d\d\d\d]//g;
print $courseID;

Result: 21090112345 --- prints the digits from the course section together with the course ID

$string = "BIO-2109-01 (12345) Introduction to Biology";
($courseID = $string) =~ s/[^\b\(\d{5}\)]\b//g;
print $courseID;

Result: 210901 (12345) --- prints the digits from the course section, the parens, and the course ID

So I have had no luck with search and replace. However, I did find this nugget:

\(([^\)]+)\)

Tested at http://regexr.com/, it matches whatever sits between parens, for example (abc).

What I am after is something like this:

$string = "BIO-2109-01 (12345) Introduction to Biology";
($courseID = $string) =~ [magicRegex_goes_here];
print $courseID;

12345

and likewise:

$string = "IDS-3250-01 (98765) History of US (1860-2000)";
($courseID = $string) =~ [magicRegex_goes_here];
print $courseID;

98765

UPDATE

use warnings 'all';
use strict;
use feature 'say';

my $file = './data/enrollment.csv';      #File this script generates
my $course = "";                         #Complete course string [name-of-course] [(courseID)] [course_name]
my @arrayCourses = "";                   #Array of courseIDs (starts with one empty entry, shifted off below)
my $i = "";                              #i in for loop
my $courseID = "";                       #Extracted course ID
my $userName = "";                       #Username of person we are enrolling
my $action = "add,";                     #What we are doing to user
my $permission = "teacher,";             #What permissions to assign to user
my $stringToPrint = "";                  #Concatenated string to write to file
my $n = "\n";                            #\n
my $c = ",";                             #,

#BEGIN PROGRAM

print "Enter the username \n";

chomp($userName = <STDIN>);               #Get the enrollee username from user

print "\n";

print "Enter course name and press enter.  Enter 'x' to end. \n";  #prompt for course names

while ($course ne 'x') {
        chomp($course = <STDIN>);
        if ($course ne "x") {
                if (($courseID) = ($course =~ /[^(]+\(([^)]+)\)/) ) {     #nasty regex to extract courseID - thnx PerlDuck and zdim
                        push @arrayCourses, $courseID;                    #put the courseID into array
                }
                else {
                        print "Cannot process the last entry, please check it.\n";
                }
        }
        else {
                last;
        }
}

shift @arrayCourses;                      #Remove the empty first entry, which would otherwise be written as add,teacher,,username

open(my $fh, '>', $file) or die "Can't open $file: $!";   #open file for writing

for $i (@arrayCourses)                    #write array to file
{
        $stringToPrint= join "", $action, $permission, $i, $c, $userName, $n ;
        print $fh $stringToPrint;
}

close $fh;

Thank you! @PerlDuck @zdim

+4

string scripting regex perl

squadguy · Oct 26 '16 at 19:56

2 Answers

zdim · Answer 1 · 2016-10-26T20:04:02+0000

my ($section, $id, $name) = 
    $string =~ /^\s* ([^(]+) \(\s* ([^)]+) \)\s* (.+) $/x;

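The key is the negated character class, [^...], which matches any single character other than those listed after the ^ (read it as "not"). Without the ^, a plain [] class matches any one of the listed characters.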

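So the pattern first captures ([^(]+), a run of characters that are not (, which is everything up to the first (. Next comes ([^)]+), a run of characters that are not ), which is whatever sits inside the first pair of parentheses. That group is placed between the escaped \( ... \), which match literal parentheses, while the unescaped ( ) do the capturing. Finally, the rest of the line is captured by (.+), where + requires at least one character.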

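The /x modifier allows spaces (and comments) inside the pattern, for readability. In list context the match operator returns the list of captures, which is assigned here to the three variables; the parentheses ( ) around the variable list are what impose that list context. See the regex tutorial (perlretut).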
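The role of list context can be seen in a minimal sketch. It uses the first sample record from the question and a shortened, ID-only pattern; the variable names are illustrative assumptions, not part of the answer.

```perl
use strict;
use warnings;

my $string = "BIO-2109-01 (12345) Introduction to Biology";

# List context (note the parentheses around $id):
# the match returns the captured groups.
my ($id) = $string =~ /\(([^)]+)\)/;

# Scalar context: the match returns only true (1) or false.
my $matched = $string =~ /\(([^)]+)\)/;

print "list context:   $id\n";       # 12345
print "scalar context: $matched\n";  # 1
```

Dropping the parentheses around the variable is a common slip: the assignment then stores the success flag instead of the captured text.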

For example, to process a whole file,

use warnings 'all';
use strict;
use feature 'say';

my $file = 'catalog.txt';

open my $fh, '<', $file or die "Can't open $file: $!";

while (my $line = <$fh>) 
{
    next if $line =~ /^\s*$/;  # skip empty lines

    # Strip leading and trailing white space
    $line =~ s{^\s*|\s*$}{}g;

    my ($section, $id, $name) = 
        $line =~ /^ ([^(]+) \(\s* ([^)]+) \)\s* (.+) $/x
            or do {
                warn "Error with expected format -- ";
                next;
            };

    say "$section, $id, $name";
}
close $fh;

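Here s{}{} is the same substitution operator as s///, only with braces as delimiters, which can be easier to read when the pattern contains a lot of other punctuation.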

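The loop above only prints the captured fields. In a real program you would likely store them in a suitable data structure (for example a hash or an array), depending on what is needed later. See the data structures cookbook (perldsc).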

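A note on quantifiers. The * means zero or more (so \s* also allows no space at all), while + means one or more, so that part must be present. The .+ is greedy and runs to the end of the line, so any parentheses inside the course name are simply part of it. This is why (.+) picks up the whole remainder.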
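As a small illustration of the three-part pattern, here is a sketch applied to the record whose name itself contains parentheses. The helper name parse_record and the trimming step are assumptions added for the example; they are not from the answer.

```perl
use strict;
use warnings;

# parse_record is a hypothetical helper name used only for this sketch.
sub parse_record {
    my ($line) = @_;

    my ($section, $id, $name) =
        $line =~ /^\s* ([^(]+) \(\s* ([^)]+) \)\s* (.+) $/x
            or return;    # empty list when the line does not fit the format

    # [^(]+ also picks up the space before the '(', so trim trailing whitespace
    s/\s+$// for $section, $id, $name;

    return ($section, $id, $name);
}

my @parts = parse_record("IDS-3250-01 (98765) US History (1860-2000)");
print join(" | ", @parts), "\n";   # IDS-3250-01 | 98765 | US History (1860-2000)
```

The second pair of parentheses ends up inside the name because the first two groups have already consumed everything up to and including the first closing paren.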

If only the ID is needed,

my ($id) = $line =~ / \(\s* ([^)]+) \) /x  or do { ... };

This captures only what is inside the first pair of parentheses.
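A short sketch of the ID-only form, run over two of the question's records that have a second pair of parentheses in the name. The array names and the warning text are assumptions made for the example.

```perl
use strict;
use warnings;

my @records = (
    "IDS-3250-01 (98765) US History (1860-2000)",
    "SPN-1234-02-F1 (45678) History of Spain (1900-2010)",
);

my @ids;
for my $line (@records) {
    # The unanchored pattern matches at the leftmost '(' it can,
    # so the years at the end of the name are never reached.
    my ($id) = $line =~ / \(\s* ([^)]+) \) /x
        or do { warn "No ID found in: $line\n"; next };
    push @ids, $id;
}

print "@ids\n";   # 98765 45678
```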

PerlDuck · Answer 2 · 2016-10-26T20:36:12+0000

#!/usr/bin/env perl

use strict;
use warnings;

while( my $line = <DATA> ) {
    if (my ($courseID) = ($line =~ /[^(]+\(([^)]+)\)/) ) {
        print "course-ID = $courseID; -- line was $line";
    }
}

__DATA__
BIO-2109-01 (12345) Introduction to Biology
CHM-3501-F2-01 (54321) Introduction to Chemistry
IDS-3250-01 (98765) History of US (1860-2000)
SPN-1234-02-F1 (45678) Spanish History (1900-2010)

Output:

course-ID = 12345; -- line was BIO-2109-01 (12345) Introduction to Biology
course-ID = 54321; -- line was CHM-3501-F2-01 (54321) Introduction to Chemistry
course-ID = 98765; -- line was IDS-3250-01 (98765) History of US (1860-2000)
course-ID = 45678; -- line was SPN-1234-02-F1 (45678) Spanish History (1900-2010)

The pattern, /[^(]+\(([^)]+)\)/, explained:

/ [^(]+     # 1 or more characters that are not a '('
  \(        # a literal '('. You must escape it because you don't want
            # it to start a capture group.
  ([^)]+)   # 1 or more chars that are not a ')'.
            # The surrounding '(' and ')' capture this match
  \)        # a literal ')'
/x

The /x modifier is what allows the whitespace and comments inside the pattern.

You can use /x directly in the code, too:

while( my $line = <DATA> ) {
    if (my ($courseID) = ($line =~ / [^(]+   # …
                                     \(      # …
                                     ([^)]+) # …
                                     \)      # …
                                    /x ) ) {
        print "course-ID = $courseID; -- line was $line";
    }
}

Or, to keep the condition readable, store the pattern in a variable first:

my $pattern = 
    qr/ [^(]+     # 1 or more characters that are not a '('
        \(        # a literal '(' (you must escape it)
        ([^)]+)   # 1 or more chars that are not a ')'.
                  # The surrounding '(' and ')' capture this match
        \)        # a literal ')'
      /x;

and then use it:

if (my ($courseID) = ($line =~ $pattern)) {
    …
}
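Once the pattern is compiled with qr//, it can be reused anywhere a regex is accepted. The sketch below applies it to a list in one pass; the sample list (including the deliberately malformed third entry) and the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $pattern = qr/ [^(]+ \( ([^)]+) \) /x;

my @lines = (
    "BIO-2109-01 (12345) Introduction to Biology",
    "CHM-3501-F2-01 (54321) Introduction to Chemistry",
    "not a course record",
);

# Keep the capture for lines that match; contribute nothing for the rest.
my @ids = map { $_ =~ $pattern ? $1 : () } @lines;

print "@ids\n";   # 12345 54321
```

Returning the empty list for a non-matching line means the result holds only valid IDs, with no undef placeholders to filter out afterwards.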
