Get PowerShell regular expression matches into a table

I am trying to extract a dataset from some (large) text files. Basically, each line looks something like this:

2011-12-09 18:20:55, ABC.EXE[3b78], The rest of the line... 

I would like to get the date and the bit between the square brackets (the process ID) and compile them into a table. The second stage of the task is to group this table so that I get the earliest date for each process ID, in effect giving me the date and time of the first log entry for each process ID, which will hopefully approximate the start time of that process instance.

What I have so far (split across lines for readability):

 gci -filter *.log -r |
   select-string '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), ABC\.EXE\[(.{4})' |
   % { $_.matches } | % { $_.groups } | % { $_.value }

spits out the groups. I would like to ignore the first group (the whole match) and combine the second and third on one line.

Help? Please?

Edit: D'oh! I can't answer my own question yet, so...

OK, I think I'm on the right track. The SO question here helped me get at the individual parts I wanted, namely:

 $_.matches[0].groups[1].value, $_.matches[0].groups[2].value 

Then, the MSDN article here shows how to bundle the bits into an object, which can then be grouped / sorted / manipulated. Final result:

 gci -filter *.log |
   select-string '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), ABC\.EXE\[(.{4})' |
   % {
     new-object object |
       add-member NoteProperty Name $_.matches[0].groups[1].value -passthru |
       add-member NoteProperty PId $_.matches[0].groups[2].value -passthru
   }

Pretty dirty, so if anyone knows a cleaner way to do this, please let me know.

1 answer

Creating new objects is easier in PowerShell v2, where New-Object supports the -Property parameter, which takes a hash table of properties:

 New-Object PSObject -Property @{
   Name = $_.matches[0].groups[1].value
   PId  = $_.matches[0].groups[2].value
 }

In general, I would do the processing a little differently, though:

 # prepare table
 $data = $(switch -Regex -File filename {
   '^[^,]+'        { $date = [datetime]$Matches[0] }
   '(?<=\[)[^\]]+' { $id = $Matches[0] }
   '$'             { New-Object PSObject -Property @{ Date = $date; PId = $id } }
 })

Using switch -Regex has become a nice way (at least for me) to write quick and dirty parsers for text data. With -Regex, every matching case is executed, not just the first one; here all three match on every line, so the cases are just a convenient way to separate the different parts of the match. The first takes the date and time and stores it in a variable (already converted to a DateTime value); the second gets the process ID; and the third, which matches the end of the line, puts everything together.
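To see the pattern without touching any files, here is a minimal sketch that runs the same three cases over an in-memory array. The sample lines and the `$lines` variable are made up for illustration; switch iterates over the array elements just as it iterates over lines with -File.

```powershell
# Hypothetical sample lines standing in for the contents of a log file.
$lines = @(
    '2011-12-09 18:20:55, ABC.EXE[3b78], first entry'
    '2011-12-09 18:21:10, ABC.EXE[3b78], second entry'
    '2011-12-09 18:19:02, ABC.EXE[1f04], other process'
)

# switch processes each array element in turn; with -Regex every
# matching case runs for that element (there is no break).
$data = @(switch -Regex ($lines) {
    '^[^,]+'        { $date = [datetime]$Matches[0] }   # text up to the first comma
    '(?<=\[)[^\]]+' { $id = $Matches[0] }               # text between [ and ]
    '$'             { New-Object PSObject -Property @{ Date = $date; PId = $id } }
})

$data | Format-Table PId, Date -AutoSize
```

Each input line yields one object with a Date and a PId property, which is the shape the grouping step expects.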

Just a personal preference; I have never really used Select-String.

 $data | group PId | foreach {
   New-Object PSObject -Property @{
     PId     = $_.Name
     MinDate = @($_.Group | sort Date)[0].Date
   }
 }

This takes the compiled data, groups it by process ID, and emits each ID together with its minimum date.

Note that this is more of a “looks nice in code” approach. If the files you are dealing with are really large, you will probably want something more efficient.
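One possible direction for large inputs, as a sketch rather than a tuned solution: make a single pass and keep only the earliest timestamp per process ID in a hash table, instead of building every row and sorting each group. The sample lines, the `$pattern` regex, and the `$first` and `$result` variables are assumptions made for illustration; for real files the loop would read the lines from the logs instead of from an array.

```powershell
# Hypothetical sample lines; deliberately not in chronological order.
$lines = @(
    '2011-12-09 18:21:10, ABC.EXE[3b78], second entry'
    '2011-12-09 18:20:55, ABC.EXE[3b78], first entry'
    '2011-12-09 18:19:02, ABC.EXE[1f04], other process'
    'a line that does not match the expected format'
)

# Assumed line format: "<timestamp>, ABC.EXE[<id>], ..."
$pattern = '^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), ABC\.EXE\[([^\]]+)\]'

$first = @{}   # process ID -> earliest timestamp seen so far

foreach ($line in $lines) {
    if ($line -match $pattern) {
        $date = [datetime]$Matches[1]
        $id   = $Matches[2]
        # Store the date if the ID is new or this entry is earlier.
        if ((-not $first.ContainsKey($id)) -or ($date -lt $first[$id])) {
            $first[$id] = $date
        }
    }
}

# Turn the hash table into objects so they can be sorted or exported.
$result = $first.GetEnumerator() | ForEach-Object {
    New-Object PSObject -Property @{ PId = $_.Key; MinDate = $_.Value }
}
$result | Sort-Object MinDate
```

Memory use then grows with the number of distinct process IDs rather than with the number of log lines.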


Source: https://habr.com/ru/post/1385800/

