How can I use sed to extract the lines between two patterns and post-process them in a loop?

I want to do something like this. Say I have the text below:

Start-pattern  
orange  
apple  
grape  
orange  
orange  
End-pattern  
#######  
bla bla bla  
########  
Start-pattern  
orange  
apple  
grape  
apple  
orange  
End-pattern  
#######
bla bla bla
########
Start-pattern  
orange  
orange  
orange  
End-pattern  
#######  
bla bla bla  
########

Here I want to print how many oranges, apples and grapes there are between each pair of Start-pattern and End-pattern.

In the above example, there are 3 "orange", 1 "apple" and 1 "grape" between the first Start-pattern and End-pattern, 2 "orange", 2 "apple" and 1 "grape" between the second pair, and so on.

Waiting for your valuable answers.

+4
2 answers

You can try this awk one-liner:

awk '$1 ~ /^Start-pattern$/{p=1;next} $1 ~ /^End-pattern$/{p=0; for (var in a) print var,a[var]; delete a; print "######"; next} p{a[$1]++}' file

More readable awk:

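# Start of a block: turn counting on
$1 ~ /^Start-pattern$/ {
    p=1;
    next
}
# End of a block: turn counting off, print the counts, then reset them
$1 ~ /^End-pattern$/ {
    p=0;
    for (var in a) {
        print var,a[var]
    }
    delete a;    # clear the array so stale keys do not show up in later blocks
    print "######";
    next
}
# Inside a block: count each occurrence of the first field
p {
    a[$1]++;
}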

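Explanation:

The awk script consists of 3 rules.

  • When a line matches Start-pattern, set p=1.
  • When a line matches End-pattern, set p=0, print the contents of a[] and clear it.
  • While p is set, count each line in a[].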
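To make the three rules concrete, here is a minimal sketch that runs them on a small inline sample. The sample text and the variable names input and result are made up for illustration, and the array is reset with "delete a", which assumes an awk that supports deleting a whole array (gawk and mawk do).

```shell
# Hypothetical sample input: two short blocks separated by an unrelated line.
input='Start-pattern
orange
apple
orange
End-pattern
bla bla bla
Start-pattern
grape
End-pattern'

# Same three rules as in the answer: start counting, stop and report, count.
result=$(printf '%s\n' "$input" | awk '
$1 ~ /^Start-pattern$/ { p = 1; next }
$1 ~ /^End-pattern$/ {
    p = 0
    for (var in a) print var, a[var]   # order of keys is unspecified
    delete a                           # reset the counts for the next block
    print "######"
    next
}
p { a[$1]++ }')

printf '%s\n' "$result"
```

The first block should report 2 for orange and 1 for apple (in whatever order awk walks the array), followed by the separator; the second block should report only grape. If a fixed order matters, the per-block output would need to be sorted separately.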
+3

Another option (GNU sed with echo, sort and uniq):

sed -nr '/Start/,/End/!b;/Start/h;//!H;/End/!b;x;s/^[^\n]*\n(.*)\n.*/echo "\1"|sort|uniq -c/e;s/\n//g;p' file

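Make sed behave like grep by passing -n, so nothing is printed unless requested. Collect the lines from Start to End in the hold space (HS) and, on reaching End, swap the pattern space (PS) with the HS. Strip the first and last lines; pass what remains through sort and uniq. Print the PS.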
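The counting step on its own can be sketched as follows. This is only an illustration of what sort piped into uniq -c does to the contents of one block; the variable names block and counts are made up for the example and are not part of the sed command.

```shell
# Hypothetical contents of one block, with the Start/End lines already removed.
block='orange
apple
grape
orange
orange'

# sort groups identical lines together; uniq -c then prefixes each distinct
# line with the number of times it occurred.
counts=$(printf '%s\n' "$block" | sort | uniq -c)

printf '%s\n' "$counts"
```

Each output line holds a count followed by the fruit name; the amount of leading padding before the count can differ between uniq implementations.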

+1

Source: https://habr.com/ru/post/1657984/

