Multi-line pattern matching in bash

I have a long file like

    Processin SCRIPT10 file..
    Submitted batch job 1715572
    Processin SCRIPT100 file..
    Processin SCRIPT1000 file..
    Submitted batch job 1715574
    Processin SCRIPT10000 file..
    Processin SCRIPT10001 file..
    Processin SCRIPT10002 file..
    Submitted batch job 1715577
    Processin SCRIPT10003 file..
    Submitted batch job 1715578
    Processin SCRIPT10004 file..
    Submitted batch job 1715579

I want to find the jobs (script names) that were not submitted, that is, every "Processin" line that is not immediately followed by a "Submitted batch job" line.

So far, I have been trying to complete this task using

 pcregrep -M "Processin.*\n.*Processin" execScripts2.log | awk 'NR % 2 == 0' 

But it does not cope properly with the case where several scripts in a row were not submitted. Surprisingly, it outputs only the lines for SCRIPT1000 and SCRIPT10001. Can you show me a better one-liner?

Ideally, the output would be only the lines that do not have "Submitted" on the next line (or just the script names), which means:

    SCRIPT100
    SCRIPT10000
    SCRIPT10001

Thanks.

2 answers

This awk can do the job:

    awk -vs='Submitted' '$1 != s{if(p != "") print p; p=$2} $1 == s{p=""}' file
    SCRIPT100
    SCRIPT10000
    SCRIPT10001

Link: Effective AWK Programming
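
If the one-liner is hard to read, the same logic can be spread over several lines with comments. This is only a sketch, not the exact command from the answer: the function name `unsubmitted` is made up for illustration, and the `END` block is my own addition (an assumption about what you want) so that a "Processin" line at the very end of the file with no "Submitted" after it is reported as well.

```shell
#!/bin/bash
# unsubmitted: hypothetical helper name, not part of the original answer.
# Reads the log from the given file(s), or from stdin when none is given.
unsubmitted() {
    awk -v s='Submitted' '
        # Not a "Submitted" line: if a script is still remembered, it was
        # never followed by "Submitted", so report it. Then remember this one.
        $1 != s { if (p != "") print p; p = $2 }
        # A "Submitted" line: the remembered script was submitted; forget it.
        $1 == s { p = "" }
        # Assumption beyond the original one-liner: also report a script
        # that is still pending when the input ends.
        END { if (p != "") print p }
    ' "$@"
}

# Small demonstration on inline sample data taken from the question.
printf '%s\n' \
    'Processin SCRIPT10 file..' \
    'Submitted batch job 1715572' \
    'Processin SCRIPT100 file..' \
    'Processin SCRIPT1000 file..' \
    'Submitted batch job 1715574' |
    unsubmitted
```

On that sample it prints SCRIPT100. On a real log it would be called as `unsubmitted execScripts2.log`. Drop the `END` block if a trailing unsubmitted script should not be listed.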


Without awk, you can put the logic in a bash script file and run that. If you are less familiar with awk, a bash script may be easier to keep customizing.

    #!/bin/bash
    # Print every "Processin" line that is not followed by a "Submitted" line.
    tempText=""
    while read -r line
    do
        if [[ "$line" == Processin* ]]; then
            tempText=$line
            # Keep reading until a "Submitted" line closes the run.
            while read -r line
            do
                if [[ "$line" != Submitted* ]]; then
                    # The remembered line was never submitted.
                    echo "$tempText"
                    tempText=$line
                else
                    break
                fi
            done
        fi
    done < "$1"

Run with ./check.sh filename

The existing answer also works fine.
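
The bash approach in this answer prints whole log lines. If only the script names are wanted, as in the expected output in the question, one possible bash-only variant lets `read` split each line into fields. Treat it as a rough sketch: the function name `list_unsubmitted` is invented here, and I am assuming that a script still pending at the end of the input should count as not submitted.

```shell
#!/bin/bash
# list_unsubmitted: hypothetical helper name. Reads the log from stdin and
# prints the name of each script whose "Processin" line is not followed
# by a "Submitted" line.
list_unsubmitted() {
    local pending="" first name rest
    # The "|| [[ -n ... ]]" part also handles a last line without a newline.
    while read -r first name rest || [[ -n "$first" ]]; do
        if [[ "$first" == Submitted ]]; then
            # The pending script was submitted; forget it.
            pending=""
        elif [[ "$first" == Processin ]]; then
            # A new script starts while another is still pending:
            # the pending one was never submitted.
            if [[ -n "$pending" ]]; then
                echo "$pending"
            fi
            pending=$name
        fi
    done
    # Assumption: a script still pending at end of input is unsubmitted.
    if [[ -n "$pending" ]]; then
        echo "$pending"
    fi
}

# Small demonstration on inline sample data taken from the question.
list_unsubmitted <<'EOF'
Processin SCRIPT10 file..
Submitted batch job 1715572
Processin SCRIPT100 file..
Processin SCRIPT1000 file..
Submitted batch job 1715574
EOF
```

On that sample it prints SCRIPT100. For a real log, redirect the file into the function, for example `list_unsubmitted < execScripts2.log`. Lines that start with neither "Processin" nor "Submitted" are ignored.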


Source: https://habr.com/ru/post/1268167/

