Unix shell: replace with dictionary

I have a file that contains some data, for example

    2011-01-02 100100 1
    2011-01-02 100200 0
    2011-01-02 100199 3
    2011-01-02 100235 4

and have some "dictionary" in a separate file

    100100 Event1
    100200 Event2
    100199 Event3
    100235 Event4

and I know that

    0 - warning
    1 - error
    2 - critical
    etc...

I need a little script with sed / awk / grep or something else that helps me get data like this

    100100 Event1 Error
    100200 Event2 Warning
    100199 Event3 Critical
    etc

I will be grateful for ideas on how to do this in the best way, or for a working example

Update

sometimes I have such data

    2011-01-02 100100 1
    2011-01-02 sometext 100200 0
    2011-01-02 100199 3
    2011-01-02 sometext 100235 4

where sometext is any 6 characters (maybe this is useful information). In this case, I need the whole line:

 2011-01-02 sometext EventNameFromDictionary Error 

or without "sometext"

3 answers
    awk 'BEGIN {
      lvl[0] = "warning"
      lvl[1] = "error"
      lvl[2] = "critical"
    }
    NR == FNR {
      evt[$1] = $2; next
    }
    {
      print $2, evt[$2], lvl[$3]
    }' dictionary infile
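
To try the one-liner above, a minimal self-contained sketch follows. The file names `dictionary` and `infile` come from the answer; the temporary directory, the `lookup` function name, and the use of the question's sample rows are assumptions for illustration. Levels that have no entry in `lvl` (3 and 4 in the sample) come out as an empty third field.

```shell
#!/bin/sh
# Sketch: run the lookup against the sample data from the question.
# The temporary directory and the function name are illustrative only.
workdir=$(mktemp -d)

cat > "$workdir/dictionary" <<'EOF'
100100 Event1
100200 Event2
100199 Event3
100235 Event4
EOF

cat > "$workdir/infile" <<'EOF'
2011-01-02 100100 1
2011-01-02 100200 0
2011-01-02 100199 3
2011-01-02 100235 4
EOF

# lookup DICT DATA: print event id, event name, level name
lookup() {
  awk 'BEGIN {
    lvl[0] = "warning"
    lvl[1] = "error"
    lvl[2] = "critical"
  }
  NR == FNR { evt[$1] = $2; next }   # first file: build the id to name map
  { print $2, evt[$2], lvl[$3] }     # second file: translate id and level
  ' "$1" "$2"
}

lookup "$workdir/dictionary" "$workdir/infile"
```

The first two output lines are `100100 Event1 error` and `100200 Event2 warning`.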

I hope perl is ok too:

    #!/usr/bin/perl
    use strict;
    use warnings;

    open(DICT, 'dict.txt') or die;
    my %dict = %{{ map { my ($id, $name) = split; $id => $name } (<DICT>) }};
    close(DICT);

    my %level = ( 0 => "warning", 1 => "error", 2 => "critical" );

    open(EVTS, 'events.txt') or die;
    while (<EVTS>) {
        my ($d, $i, $l) = split;
        $i = $dict{$i} || $i;   # lookup
        $l = $level{$l} || $l;  # lookup
        print "$d\t$i\t$l\n";
    }

Output:

    $ ./script.pl
    2011-01-02 Event1 error
    2011-01-02 Event2 warning
    2011-01-02 Event3 3
    2011-01-02 Event4 4

Adding a new answer for the new requirement, since formatting options in comments are limited:

    awk 'BEGIN {
      lvl[0] = "warning"
      lvl[1] = "error"
      lvl[2] = "critical"
    }
    NR == FNR {
      evt[$1] = $2; next
    }
    {
      if (NF > 3) {
        idx = 3; $1 = $1 OFS $2
      }
      else idx = 2
      print $1, $idx in evt ? \
        evt[$idx] : $idx, $++idx in lvl ? \
          lvl[$idx] : $idx
    }' dictionary infile

You do not need to escape the newlines inside the ternary operator if you use GNU awk.

Some awk implementations may have problems with this part:

 $++idx in lvl ? lvl[$idx] : $idx 

If you use one of them, change it to:

 $(idx + 1) in lvl ? lvl[$(idx + 1)] : $(idx + 1) 
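
Putting that substitution into the full script gives a variant that does not rely on `$++idx`. This is a sketch rather than the answerer's exact code: the field values are copied into the helper variables `id` and `level`, each ternary is parenthesized and kept on one line so no line continuation is needed, and the `translate` function name, the temporary directory, and the sample files (taken from the question's update) are assumptions for illustration.

```shell
#!/bin/sh
# translate DICT DATA: replace the event id and the level in each data row.
# Rows may carry an extra "sometext" column after the date.
translate() {
  awk 'BEGIN {
    lvl[0] = "warning"
    lvl[1] = "error"
    lvl[2] = "critical"
  }
  NR == FNR { evt[$1] = $2; next }   # first file: id to event name
  {
    if (NF > 3) {                    # date, sometext, id, level
      idx = 3
      $1 = $1 OFS $2                 # keep sometext next to the date
    } else
      idx = 2                        # date, id, level
    id = $idx
    level = $(idx + 1)
    # fall back to the raw value when a key is missing from the map
    print $1, (id in evt ? evt[id] : id), (level in lvl ? lvl[level] : level)
  }' "$1" "$2"
}

# Sample data; the temporary directory is illustrative only.
workdir=$(mktemp -d)

cat > "$workdir/dictionary" <<'EOF'
100100 Event1
100200 Event2
100199 Event3
100235 Event4
EOF

cat > "$workdir/infile" <<'EOF'
2011-01-02 100100 1
2011-01-02 sometext 100200 0
2011-01-02 100199 3
2011-01-02 sometext 100235 4
EOF

translate "$workdir/dictionary" "$workdir/infile"
```

Unlike the first one-liner, unmapped levels such as 3 and 4 are printed as-is instead of as an empty field.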

OK, added comments:

    awk 'BEGIN {
      lvl[0] = "warning"   # map the numeric levels to their names
      lvl[1] = "error"
      lvl[2] = "critical"
    }
    NR == FNR {            # while reading the first non-empty input file
      evt[$1] = $2         # build the associative array evt, keyed by the
                           # first column, with the second column as value
      next                 # skip the rest of the program
    }
    {                      # now reading the rest of the input
      if (NF > 3) {        # if the number of columns is greater than 3
        idx = 3            # set idx to 3 (the column holding the evt key)
        $1 = $1 OFS $2     # and merge $1 and $2
      }
      else idx = 2         # else set idx to 2
      # Print the first column. Then, if the value of column idx is an
      # existing key in the evt array, print its value, otherwise print
      # the actual column value. Do the same for the lvl array, but
      # increment idx first, because the level is in the next column.
      print $1, ($idx in evt ? evt[$idx] : $idx),
        ($++idx in lvl ? lvl[$idx] : $idx)
    }' dictionary infile

Source: https://habr.com/ru/post/1492271/