Parsing a configuration file in bash

Here is my configuration file (dansguardian-config):

    banned-phrase duck
    banned-site allaboutbirds.org

I want to write a bash script that will read this configuration file and create some other files for me. Here is what I have so far; it is basically pseudo-code:

    while read line
    do
        # if line starts with "banned-phrase"
        #   add rest of line to file bannedphraselist
        # fi
        # if line starts with "banned-site"
        #   add rest of line to file bannedsitelist
        # fi
    done < dansguardian-config

I'm not sure if I need to use grep, sed, awk or what.

Hope this makes sense. I just hate DansGuardian listings.

+4
6 answers

With awk:

    $ cat config
    banned-phrase duck frog bird
    banned-phrase horse
    banned-site allaboutbirds.org duckduckgoose.net
    banned-site froggingbirds.gov

    $ awk '$1=="banned-phrase"{for(i=2;i<=NF;i++)print $i >"bannedphraselist"}
           $1=="banned-site"{for(i=2;i<=NF;i++)print $i >"bannedsitelist"}' config

    $ cat bannedphraselist
    duck
    frog
    bird
    horse

    $ cat bannedsitelist
    allaboutbirds.org
    duckduckgoose.net
    froggingbirds.gov

Explanation:

By default, awk splits each line into fields on whitespace, and each field is referenced as $i , where i is the field's position: the first field on each line is $1 , the second is $2 , and so on up to $NF , where NF is a variable that holds the number of fields on the current line.

So the script is simple:

  • Compare the first field with the keyword we are looking for: $1=="banned-phrase"

  • If the first field matches, loop over all the remaining fields with for(i=2;i<=NF;i++) , print each field with print $i , and redirect the output to the file with >"bannedphraselist" .
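To see the field splitting in isolation, here is a minimal sketch that feeds one sample line to awk and prints NF followed by each field. The sample line and the variable names `line` and `fields` are made up for illustration only.

```shell
#!/usr/bin/env bash
# Illustrative sample line, in the same format as the config file.
line='banned-phrase duck frog bird'

# awk splits the line on whitespace: $1 is the keyword, $2..$NF are the values.
fields=$(printf '%s\n' "$line" |
    awk '{ print "NF=" NF; for (i = 1; i <= NF; i++) print "$" i "=" $i }')

printf '%s\n' "$fields"
# NF=4
# $1=banned-phrase
# $2=duck
# $3=frog
# $4=bird
```

Note that inside awk, > "file" truncates the file only when it is first opened during the run; later prints to the same file name append to it, which is why the loop does not overwrite earlier entries.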

+5

You could do

    sed -n 's/^banned-phrase *//p' dansguardian-config > bannedphraselist
    sed -n 's/^banned-site *//p' dansguardian-config > bannedsitelist

This does mean reading the file twice, but I doubt the potential loss of performance matters.
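If reading the file twice ever does matter, one possible variation is a single sed invocation that uses the `w` flag of the `s` command to write each substituted line straight to a file. This is only a sketch, not part of the answer above; the scratch directory and the sample config are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Work in a scratch directory with a small sample config (illustrative data).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'banned-phrase duck' 'banned-site allaboutbirds.org' > dansguardian-config

# One pass: each s command strips its keyword and, on a match, the w flag
# writes the result to the named file. -n suppresses the normal output.
sed -n \
    -e 's/^banned-phrase *//w bannedphraselist' \
    -e 's/^banned-site *//w bannedsitelist' \
    dansguardian-config

cat bannedphraselist   # duck
cat bannedsitelist     # allaboutbirds.org
```

One caveat: both substitutions run on every line, so a banned phrase that itself starts with banned-site would also end up in the site list.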

+4

You can read into several variables at once; by default, the line is split on whitespace.

    while read command target; do
        case "$command" in
            banned-phrase) echo "$target" >>bannedphraselist;;
            banned-site) echo "$target" >>bannedsitelist;;
            "") ;; # blank line
            *) echo >&2 "$0: unrecognized config directive '$command'";;
        esac
    done < dansguardian-config

This is just an example. A smarter implementation would first read the list files, make sure the entries are not already banned, and so on.
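The smarter implementation hinted at here could, for example, skip entries that already exist in the list file. A minimal sketch, assuming an exact whole-line match is the right notion of "already banned"; the helper name `add_unique` and the sample data are hypothetical.

```shell
#!/usr/bin/env bash
# Scratch directory and sample config (illustrative data, "duck" listed twice).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'banned-phrase duck' 'banned-site allaboutbirds.org' \
    'banned-phrase duck' > dansguardian-config

# add_unique FILE ENTRY: append ENTRY to FILE unless an identical line exists.
# (Hypothetical helper, not from the answer above.)
add_unique() {
    local file=$1 entry=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat ENTRY as a fixed string.
    grep -qxF -- "$entry" "$file" || printf '%s\n' "$entry" >>"$file"
}

while read -r command target; do
    case "$command" in
        banned-phrase) add_unique bannedphraselist "$target" ;;
        banned-site)   add_unique bannedsitelist "$target" ;;
        "") ;;  # blank line
        *) echo >&2 "$0: unrecognized config directive '$command'" ;;
    esac
done < dansguardian-config

cat bannedphraselist   # duck (written once, although listed twice)
```

Because `read` assigns everything after the first word to the last variable, `target` holds the whole rest of the line, so multi-word phrases are compared and stored as a single entry.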

+4

What is the problem with all the solutions that use echo text >> file ? You can check with strace that each such step opens the file, seeks to the end, writes the text, and closes the file. So running echo text >> file 1000 times means 1000 calls each to open , lseek , write , and close . The number of open , lseek , and close calls can be reduced significantly as follows:

    while read key val; do
        case $key in
            banned-phrase) echo $val>&2;;
            banned-site) echo $val;;
        esac
    done >bannedsitelist 2>bannedphraselist <dansguardian-config

Stdout and stderr are redirected to the files and stay open for as long as the loop runs, so each file is opened once and closed once, and no lseek is needed. File caching is also used more effectively, because there are no unnecessary close calls to flush the buffers each time.
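A variation on the same idea, for cases where stderr should stay available for error messages, is to open the two list files on spare file descriptors instead. This is a sketch; the descriptor numbers 3 and 4 and the sample data are arbitrary choices, not taken from the answer above.

```shell
#!/usr/bin/env bash
# Scratch directory and sample config (illustrative data).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'banned-phrase duck' 'banned-site allaboutbirds.org' \
    'banned-phrase frog' > dansguardian-config

# Descriptors 3 and 4 are opened once for the whole loop and closed when it ends.
while read -r key val; do
    case $key in
        banned-phrase) printf '%s\n' "$val" >&3 ;;
        banned-site)   printf '%s\n' "$val" >&4 ;;
        "") ;;  # blank line
        *) echo "unrecognized directive: $key" >&2 ;;  # stderr is still free
    esac
done 3>bannedphraselist 4>bannedsitelist <dansguardian-config

cat bannedphraselist   # duck, then frog
cat bannedsitelist     # allaboutbirds.org
```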

+1
    while read name value
    do
        if [ "$name" = banned-phrase ]
        then
            echo "$value" >> bannedphraselist
        elif [ "$name" = banned-site ]
        then
            echo "$value" >> bannedsitelist
        fi
    done < dansguardian-config
0

Better to use awk:

    awk '$1 ~ /^banned-phrase/{print $2 >> "bannedphraselist"}
         $1 ~ /^banned-site/{print $2 >> "bannedsitelist"}' dansguardian-config
0

Source: https://habr.com/ru/post/1482273/

