Optimizing grep (or using AWK) in a shell script

In my shell script, I am trying to search for the terms found in $sourcefile against the same $targetfile over and over again.

My $sourcefile is formatted like this:

pattern1
pattern2
etc...

The inefficient loop I currently use to do this:

for line in $(< "$sourcefile"); do
    fgrep "$line" "$targetfile" | fgrep "RID" >> "$outputfile"
done

As far as I understand, it should be possible to improve this by loading the entire target file into memory or, possibly, by using AWK?

thanks

+4
3 answers

fgrep -f "$sourcefile" "$targetfile" I missing something, or why not just fgrep -f "$sourcefile" "$targetfile" ?

+7

Sed solution:

sed 's/\(.*\)/\/\1\/p/' $sourcefile | sed -nf - $targetfile

This converts each line of $sourcefile into a sed match-and-print command:

MatchString

to

/MatchString/p

You will need to escape special characters to make it reliable.
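A rough sketch of that escaping step, assuming the patterns are meant to be literal strings: the first expression backslash-escapes the basic-regex metacharacters, the second wraps each line in /.../p as above.

sed -e 's/[][\/.*^$]/\\&/g' -e 's/.*/\/&\/p/' "$sourcefile" | sed -nf - "$targetfile"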

+2

Using awk to read in the source file, then search the target file (untested):

nawk '
NR == FNR { patterns[$0]++; next }
/RID/ {
    for (pattern in patterns) {
        # since fgrep considers patterns as strings, not regular expressions,
        # use a string lookup and not pattern matching (the "~" operator).
        if (index($0, pattern) > 0) {
            print
            break
        }
    }
}
' "$sourcefile" "$targetfile" > "$outputfile"

This will also work with gawk.
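A quick way to sanity-check the script; the file names and sample contents below are made up for illustration, and plain awk is assumed to accept the same program as nawk/gawk:

printf 'foo\nbar\n' > patterns.txt
printf 'foo RID 1\nbaz RID 2\nbar none\n' > data.txt
awk 'NR == FNR {patterns[$0]++; next}
     /RID/ {for (p in patterns) if (index($0, p) > 0) {print; break}}' patterns.txt data.txt
# expected output: foo RID 1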

+2

Source: https://habr.com/ru/post/1309582/
