How to quickly delete lines in a file that contain items from a list in another file in Bash?

Question

I have a file named words.txt containing a list of words. I also have a file named file.txt containing one sentence per line. I need to quickly remove any line in file.txt that contains one of the lines from words.txt, but only if the match is found somewhere between { and }.

For example, file.txt:

Once upon a time there was a cat.
{The cat} lived in the forest.
The {cat really liked to} eat mice.

For example, words.txt:

cat
mice

Output Example:

Once upon a time there was a cat.

The other two lines are deleted because "cat" appears on both of them, and in each case it sits between { and }.

The following script successfully completes this task:

while read -r line
do
    sed -i "/{.*$line.*}/d" file.txt
done < words.txt

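However, this script is slow. words.txt contains many words, so the while loop takes a long time, since every iteration runs sed -i over the whole of file.txt. I looked at sed -f as a possible alternative.

Is there a faster way to do this than the script above?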

optimization bash sed

Village 06 . '14 12:08

konsolebox · Answer 1 · 2014-06-06T12:34:44+0000

Using awk:

awk 'NR==FNR{a["{[^{}]*"$0"[^{}]*}"]++;next}{for(i in a)if($0~i)next;b[j++]=$0}END{printf "">FILENAME;for(i=0;i in b;++i)print b[i]>FILENAME}' words.txt file.txt

This rewrites file.txt in place. Afterwards it contains:

Once upon a time there was a cat.

The same script, expanded for readability:

awk '
    NR == FNR {
        a["{[^{}]*" $0 "[^{}]*}"]++
        next
    }
    {
        for (i in a)
            if ($0 ~ i)
                next
        b[j++] = $0
    }
    END {
        printf "" > FILENAME
        for (i = 0; i in b; ++i)
            print b[i] > FILENAME
    }
' words.txt file.txt

If you would rather not have awk overwrite the file, it can simply print the result to stdout instead:

awk '
    NR == FNR {
        a["{[^{}]*" $0 "[^{}]*}"]++
        next
    }
    {
        for (i in a)
            if ($0 ~ i)
                next
    }
    1
' words.txt file.txt
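If the stdout version is used but file.txt should still be updated, one common pattern is to write to a temporary file and move it into place only when awk succeeds. The following is a minimal sketch under stated assumptions: `filter_braced` is a hypothetical helper name, words.txt is assumed to be non-empty (otherwise the `NR == FNR` test also holds for file.txt), and the braces are written as `[{]` and `[}]` rather than as in the answer, purely to avoid awk implementations that treat a bare `{` as the start of an interval expression.

```shell
#!/usr/bin/env bash
# Sketch: filter to a temporary file, then replace the original on success.
# filter_braced is a hypothetical name, not part of the answer above.
filter_braced() {
    local words=$1 file=$2 tmp
    # Create the temporary file next to the target so mv stays on one filesystem.
    tmp=$(mktemp "$file.XXXXXX") || return 1
    if awk '
        NR == FNR { a["[{][^{}]*" $0 "[^{}]*[}]"]++; next }  # one regex per word
        { for (i in a) if ($0 ~ i) next }                    # skip matching lines
        1                                                    # print everything else
    ' "$words" "$file" > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

Called as `filter_braced words.txt file.txt`. Note that the replaced file takes on the permissions of the temporary file, which may differ from those of the original.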

Subbeh · Answer 2 · 2014-06-06T12:14:18+0000

You can use grep with the 2 files:

grep -vf words.txt file.txt

Elwinar · Answer 3 · 2014-06-06T12:16:33+0000

I think using grep should be much faster. For example:

grep -f words.txt -v file.txt

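The -f option makes grep read its patterns from words.txt.
The -v option inverts the match, so only lines that match none of the patterns are kept.

This does not handle the {} constraint, but that is easy to work around, for example by adding the braces to the pattern file (or to a temporary file generated at run time).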

pgl · Answer 4 · 2014-06-06T12:25:12+0000

Something like this should do it:

sed -e 's/.*/{.*&.*}/' words.txt | grep -vf- file.txt > out ; mv out file.txt

This converts words.txt into patterns "on the fly" and passes them to grep.
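One caveat with wrapping the words this way is that each word is then interpreted as a regular expression, so an entry such as `a.c` would also match `abc`. The sketch below escapes basic-regex metacharacters before wrapping. It assumes the words are meant as literal text, and the function names `wrap_literal_words` and `delete_braced_literal` are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: treat every line of the word list as literal text.

# Escape BRE metacharacters ( ] [ \ . * ^ $ ), then wrap each word in {.* ... .*}
wrap_literal_words() {
    sed -e 's/[][\.*^$]/\\&/g' -e 's/.*/{.*&.*}/' "$1"
}

# Print the lines of file $2 that do not contain a word from $1 inside braces.
delete_braced_literal() {
    wrap_literal_words "$1" | grep -v -f - "$2"
}
```

For example, `delete_braced_literal words.txt file.txt > out` writes the surviving lines to `out`. An empty line in words.txt would become the pattern `{.*.*}` and remove every line containing a braced section, so the word list should not contain blank lines.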

Tom Fenech · Answer 5 · 2014-06-06T12:25:17+0000

You could do this in two steps.

Wrap each word in words.txt with {.* and .*}:

awk '{ print "{.*" $0 ".*}" }' words.txt > wrapped.txt

Use grep with an inverse match:
```
grep -v -f wrapped.txt file.txt
```

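This is particularly useful if words.txt is very large, because a pure awk approach (which stores every entry of words.txt in an array) would need a lot of memory.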

If you prefer a one-liner and want to skip the intermediate file, you can do this:

awk '{ print "{.*" $0 ".*}" }' words.txt | grep -v -f - file.txt

The - is a placeholder that tells grep to read its patterns from stdin.

If words.txt is not too big, you can do the whole thing in awk:

awk 'NR==FNR{a[$0]++;next}{p=1;for(i in a){if ($0 ~ "{.*" i ".*}") { p=0; break}}}p' words.txt file.txt

Expanded:

awk 'NR==FNR { a[$0]++; next }
     { 
         p=1
         for (i in a) {
             if ($0 ~ "{.*" i ".*}") { p=0; break }
         }
     }p' words.txt file.txt

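The first block builds an array containing each line of words.txt. The second block runs for every line of file.txt. The flag p controls whether the line is printed: it starts out true, and if the line matches one of the patterns, p is set to false. When the bare p after the last block evaluates to true, awk performs its default action, which is to print the line.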
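To make the role of the flag concrete, here is a small worked run on the sample data from the question. It is only a sketch: `flag_filter` and `demo_dir` are hypothetical names, and the braces are spelled `[{]` and `[}]` instead of bare braces, an assumption made for portability across awk implementations rather than something taken from the answer.

```shell
#!/usr/bin/env bash
# Worked run of the flag-based awk filter on the question's sample data.
demo_dir=$(mktemp -d)
printf '%s\n' cat mice > "$demo_dir/words.txt"
printf '%s\n' \
    'Once upon a time there was a cat.' \
    '{The cat} lived in the forest.' \
    'The {cat really liked to} eat mice.' > "$demo_dir/file.txt"

flag_filter() {
    awk 'NR == FNR { a[$0]++; next }     # first file: remember every word
         { p = 1                         # assume the line will be printed
           for (i in a) if ($0 ~ "[{].*" i ".*[}]") { p = 0; break } }
         p' "$1" "$2"                    # bare p: print when still true
}

flag_filter "$demo_dir/words.txt" "$demo_dir/file.txt"
# prints: Once upon a time there was a cat.
```

The first line has no braces, so p stays 1 and the line is printed. The other two lines contain "cat" between braces, so p is cleared on the first matching word and the loop stops early.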

Charles Duffy · Answer 6 · 2014-06-06T12:25:22+0000

In pure bash (4.x):

#!/usr/bin/env bash
# ^-- MUST run under bash 4.0 or newer (readarray needs it), NOT /bin/sh

readarray -t words <words.txt          # read words into array
IFS='|'                                # use | as delimiter when expanding $*
words_re="[{].*(${words[*]}).*[}]"     # form a regex matching all words
while read -r; do                      # for each line in file...
  if ! [[ $REPLY =~ $words_re ]]; then # ...check whether it matches...
    printf '%s\n' "$REPLY"             # ...and print it if not.
  fi
done <file.txt

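Granted, bash is slower than awk, but this is a single-pass solution (O(n+m), whereas the sed -i loop is O(n*m)), which makes it vastly faster than any iterative approach.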
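The script above changes IFS for the rest of the shell session. If the same logic is needed inside a larger script, it could be wrapped in a function so the IFS change stays local. This is a sketch under assumptions: `filter_braced_bash` is a hypothetical name, bash 4.0 or newer is required for readarray, words.txt is assumed to be non-empty, and the words are treated as extended regular expressions, as in the original.

```shell
#!/usr/bin/env bash
# Sketch: single-pass filter with the IFS change confined to the function.
filter_braced_bash() {
    local words_file=$1 input=$2
    local -a words
    local IFS='|' words_re line            # IFS is restored when the function returns
    readarray -t words < "$words_file"     # one array element per word
    words_re="[{].*(${words[*]}).*[}]"     # join the words with | into one regex
    while IFS= read -r line || [[ -n $line ]]; do   # also handle a missing final newline
        [[ $line =~ $words_re ]] || printf '%s\n' "$line"
    done < "$input"
}
```

Usage would be `filter_braced_bash words.txt file.txt > out`. Each line is tested once against a single combined regex, which is what keeps the work to one pass over each file.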
