Perl script to search for a pattern and concat lines in a file

I have a text file (basically an error log with date, timestamp and some data) in the following template:

mm/dd/yy 12:00:00:0001  
This is line 1
This is line 2

mm/dd/yy 12:00:00:0004  
This is line 3
This is line 4
This is line 5


mm/dd/yy 12:00:00:0004
This is line 6
This is line 7

I am new to Perl and I need to write a script that scans a file for timestamps and combines the data that falls under the same timestamp.

I expect the following output for the above sample.

mm/dd/yy 12:00:00:0001  
This is line 1
This is line 2

mm/dd/yy 12:00:00:0004  
This is line 3
This is line 4
This is line 5
This is line 6
This is line 7

What is the best way to do this?

+3
4 answers

I have had to do this task on some very large files where the timestamps did not come in order, and I did not want to hold everything in memory. I solved it with a three-pass solution:

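  • tag each line with its timestamp and write it to a temp file
  • sort the temp file by timestamp, for example with sort(1)
  • read the sorted file and turn it back into the starting format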

use strict;
use warnings;
use File::Temp qw(tempfile);

my( $temp_fh, $temp_filename )  = tempfile( UNLINK => 1 );

# read each line, tag with timestamp, and write to temp file
# will sort and undo later.
my $current_timestamp = '';
LINE: while( <DATA> )
    {
    chomp;

    if( m|^\d\d/\d\d/\d\d \d\d:\d\d:\d\d:\d\d\d\d$| ) # timestamp line
        {
        $current_timestamp = $_;
        next LINE;
        }
    elsif( m|\S| ) # line with non-whitespace (not a "blank line")
        {
        print $temp_fh "[$current_timestamp] $_\n";
        }
    else # blank lines
        {
        next LINE;
        }
    }

close $temp_fh;

# sort the file by lines using some very fast sorter
system( "sort", qw(-o sorted.txt), $temp_filename );

# read the sorted file and turn back into starting format
open my($in), "<", 'sorted.txt' or die "Could not read sorted.txt: $!";

$current_timestamp = '';
while( <$in> )
    {
    my( $timestamp, $line ) = m/\[(.*?)] (.*)/;
    if( $timestamp ne $current_timestamp )
        {
        $current_timestamp = $timestamp;
        print $/, $timestamp, $/;
        }

    print $line, $/;
    }

unlink $temp_filename, 'sorted.txt';

__END__
01/01/70 12:00:00:0004
This is line 3
This is line 4
This is line 5

01/01/70 12:00:00:0001
This is line 1
This is line 2


01/01/70 12:00:00:0004
This is line 6
This is line 7
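
One thing to watch with the sort step: sort(1) compares whole lines, so the data lines under one timestamp come back ordered by their text, not in the order they appeared in the log. A minimal sketch of one way around that, assuming it is acceptable to add a zero-padded running counter to each tag; the helper names `tag_lines` and `untag_lines` are hypothetical, and the sort is done in memory with Perl's own `sort` only to keep the example self-contained:

```perl
use strict;
use warnings;

# Hypothetical helper: prefix every data line with "[timestamp] NNNNNNNN ",
# where NNNNNNNN is a zero-padded running counter.
sub tag_lines {
    my @input = @_;
    my ( $current, $seq, @tagged ) = ( '', 0 );
    for my $line (@input) {
        $line =~ s/\s+\z//;    # drop the newline and any trailing spaces
        if ( $line =~ m{^\d\d/\d\d/\d\d \d\d:\d\d:\d\d:\d{4}$} ) {
            $current = $line;  # timestamp line
        }
        elsif ( $line =~ /\S/ ) {
            push @tagged, sprintf '[%s] %08d %s', $current, ++$seq, $line;
        }
    }
    return @tagged;
}

# Hypothetical helper: turn sorted tagged lines back into the log layout.
sub untag_lines {
    my @sorted = @_;
    my ( $current, @out ) = ('');
    for my $tagged (@sorted) {
        my ( $timestamp, $text ) = $tagged =~ /^\[(.*?)\] \d{8} (.*)$/
            or next;    # ignore anything that is not a tagged line
        if ( $timestamp ne $current ) {
            push @out, '' if @out;    # blank line between groups
            push @out, $timestamp;
            $current = $timestamp;
        }
        push @out, $text;
    }
    return @out;
}

my @log = (
    '01/01/70 12:00:00:0004', 'zebra', 'apple', '',
    '01/01/70 12:00:00:0001', 'first', '',
    '01/01/70 12:00:00:0004', 'mango',
);

my @tagged = tag_lines(@log);
my @sorted = sort @tagged;    # plain string sort, as sort(1) would do
print "$_\n" for untag_lines(@sorted);
```

With the counter in place, "zebra" stays ahead of "apple" under 12:00:00:0004, because the tag decides the order before the text is ever compared. The same tagged lines could be written to the temp file and handed to sort(1) unchanged.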
+4

Collect the lines in a hash keyed by timestamp, then print the keys in sorted order:

my %h;
my $cur = "*** No date ***";
while(<>) {
  if (m"^(\d\d/\d\d/\d\d \d\d:\d\d:\d\d:\d{4})") {
    $cur = $1;
  } else {
    $h{$cur} .= $_ unless /^\s*$/;
  }
}

print "$_\n$h{$_}\n" foreach (sort keys %h);

Save it as t.pl and run it as perl t.pl < yourlog.txt.
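
Note that `sort keys %h` compares the keys as plain strings, so with mm/dd/yy timestamps a log that spans more than one year sorts by month before year. A minimal sketch of a chronological sort, assuming every two-digit year belongs to the same century; the helper name `chrono_key` and the sample timestamps are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: rearrange "mm/dd/yy hh:mm:ss:nnnn" into
# "yy/mm/dd hh:mm:ss:nnnn" so that string comparison is chronological.
# Assumption: all two-digit years belong to the same century.
# Keys that do not look like a timestamp are returned unchanged.
sub chrono_key {
    my ($stamp) = @_;
    my ( $mm, $dd, $yy, $time ) =
        $stamp =~ m{^(\d\d)/(\d\d)/(\d\d) (\S+)} or return $stamp;
    return "$yy/$mm/$dd $time";
}

my @stamps = (
    '01/15/09 08:00:00:0001',
    '12/31/08 23:59:59:0009',
    '01/15/09 07:30:00:0002',
);

# Plain string sort: December 2008 ends up after January 2009.
my @plain = sort @stamps;

# Sort on the rearranged key, computing each key only once.
my @chrono = map  { $_->[1] }
             sort { $a->[0] cmp $b->[0] }
             map  { [ chrono_key($_), $_ ] } @stamps;

print "plain:  $_\n" for @plain;
print "chrono: $_\n" for @chrono;
```

In the script above, the same idea would mean replacing `sort keys %h` with a sort on `chrono_key($_)`.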

+2

Another option: use SQLite.

+1

...

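#!/usr/bin/perl

use strict;
use warnings;

# Group the lines under each timestamp; the key is the time part only.
my (%time, $id);
while (<DATA>) {
    chomp;
    # Header line: remember the timestamp (trailing spaces are dropped).
    if ( m{^mm/dd/yy\s+(\S+)} ) {
        $id = $1;
        next;
    }
    next if /^\s*$/ or not defined $id;    # skip blank lines and any preamble
    push @{ $time{$id} }, $_;
}

# Sort the keys so the output order is predictable.
for my $i ( sort keys %time ) {
    print "mm/dd/yy $i\n";
    print "$_\n" for @{ $time{$i} };
    print "\n";
}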

__DATA__
mm/dd/yy 12:00:00:0001
This is line 1
This is line 2

mm/dd/yy 12:00:00:0004
This is line 3
This is line 4
This is line 5


mm/dd/yy 12:00:00:0004
This is line 6
This is line 7
0

Source: https://habr.com/ru/post/1710374/

