I am trying to extract a dataset from some (large) text files. Basically, each line looks something like this:
2011-12-09 18:20:55, ABC.EXE[3b78], The rest of the line...
I would like to get the date and the bit between the square brackets (the process id) and compile these into a table. The second stage of the task is to group this table so that I get the earliest date for each process id, in effect giving me the date and time of the first log entry for each process id, which will hopefully approximate the start time of that process instance.
What I have so far (split onto several lines for readability):
gci -filter *.log -r |
  select-string '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), ABC.EXE\[(.{4})' |
  % { $_.matches } | % { $_.groups } | % { $_.value }
spits out the groups, one value per line. I would like to ignore the first group (the whole match) and combine the second and third on one line.
Help? Please?
Edit: D'oh! I can't answer my own question yet, so...
OK, I think I'm on the right track. The SO question here helped me get the individual parts that I wanted, namely:
$_.matches[0].groups[1].value, $_.matches[0].groups[2].value
Then the MSDN article here shows how to “group” the bits into an object, which lets you group / sort / manipulate them. Final result:
gci -filter *.log | select-string '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), ABC.EXE\[(.{4})' | % { new-object object | add-member NoteProperty Name $_.matches[0].groups[1].value -passthru | add-member NoteProperty PId $_.matches[0].groups[2].value -passthru }
Pretty dirty, so if anyone knows a cleaner way to do this, please let me know.
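For the second stage (earliest date per process id), something along these lines might work. It is only a sketch: the sample lines are made up so the snippet runs on its own, the property names ProcessId and FirstSeen are my own choice, and I have tightened the regex slightly (escaped dot, closing bracket), which may or may not suit the real logs.

```powershell
# Made-up sample lines standing in for the real log files.
$lines = @(
    '2011-12-09 18:20:55, ABC.EXE[3b78], The rest of the line...',
    '2011-12-09 18:20:57, ABC.EXE[3b78], another entry',
    '2011-12-09 18:19:01, ABC.EXE[1a2c], first entry of another process',
    '2011-12-09 18:25:30, ABC.EXE[1a2c], later entry'
)

# Same idea as the pattern above, with the dot escaped and the closing bracket added.
$pattern = '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), ABC\.EXE\[(.{4})\]'

# Stage 1: one object per matching line, with the date parsed into a [datetime].
$entries = $lines | Select-String -Pattern $pattern | ForEach-Object {
    $g = $_.Matches[0].Groups
    New-Object PSObject -Property @{
        Date      = [datetime]::ParseExact($g[1].Value, 'yyyy-MM-dd HH:mm:ss',
                        [Globalization.CultureInfo]::InvariantCulture)
        ProcessId = $g[2].Value
    }
}

# Stage 2: group by process id and keep the earliest date in each group.
$first = $entries | Group-Object ProcessId | ForEach-Object {
    New-Object PSObject -Property @{
        ProcessId = $_.Name
        FirstSeen = ($_.Group | Sort-Object Date | Select-Object -First 1).Date
    }
}

$first | Sort-Object FirstSeen | Format-Table ProcessId, FirstSeen -AutoSize
```

Against the real files, `$lines |` would presumably be replaced by `gci -filter *.log -r |`, since select-string accepts file objects from the pipeline as well as strings.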