Optimizing grep (or using AWK) in a shell script

In my shell script, I am trying to search for the terms found in $sourcefile against the same $targetfile over and over again.

My $sourcefile is formatted like this:

pattern1
pattern2
etc...

The inefficient loop I currently use to do this:

for line in $(< "$sourcefile"); do
    fgrep "$line" "$targetfile" | fgrep "RID" >> "$outputfile"
done

As far as I understand, it should be possible to improve this by loading the entire target file into memory or, possibly, by using AWK?

thanks

+4
3 answers

fgrep -f "$sourcefile" "$targetfile" I missing something, or why not just fgrep -f "$sourcefile" "$targetfile" ?

+7

Sed solution:

sed 's/\(.*\)/\/\1\/p/' $sourcefile | sed -nf - $targetfile

This converts each line of $sourcefile into a sed match-and-print command:

MatchString

to

/MatchString/p

You will need to escape special characters to make it reliable.
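A rough sketch of that escaping step, assuming the patterns are meant to be literal strings: the first expression backslash-escapes the basic-regex metacharacters, the second wraps each line in /.../p as above.

sed -e 's/[][\/.*^$]/\\&/g' -e 's/.*/\/&\/p/' "$sourcefile" | sed -nf - "$targetfile"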

+2

Using awk to read in the source file, then search the target file (untested):

nawk '
NR == FNR { patterns[$0]++; next }
/RID/ {
    for (pattern in patterns) {
        # since fgrep considers patterns as strings, not regular expressions,
        # use a string lookup and not pattern matching (the "~" operator).
        if (index($0, pattern) > 0) {
            print
            break
        }
    }
}
' "$sourcefile" "$targetfile" > "$outputfile"

This will also work with gawk.
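A quick way to sanity-check the script; the file names and sample contents below are made up for illustration, and plain awk is assumed to accept the same program as nawk/gawk:

printf 'foo\nbar\n' > patterns.txt
printf 'foo RID 1\nbaz RID 2\nbar none\n' > data.txt
awk 'NR == FNR {patterns[$0]++; next}
     /RID/ {for (p in patterns) if (index($0, p) > 0) {print; break}}' patterns.txt data.txt
# expected output: foo RID 1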

+2

Source: https://habr.com/ru/post/1309582/
