Count line occurrences in an input file

There is a shell script that is supposed to process an incoming text file.

The text file contains one string per line, and each string may appear more than once.

The shell script should read the file and print each string together with the number of times it occurs, ignoring case.

Consider a text file:

Tim

tim

Mark

Mark

Allen

ALLen

ALLEN

The output should be like this:

Tim appears 2 times

Mark appears 2 times

Allen appears 3 times

Right now I can print the number of occurrences, but the message is repeated once for every time the line occurs; that is, "Tim appears 2 times" is printed twice. I tried replacing the string with an empty string as soon as I had counted it, but for some reason sed is not working. Maybe I am not calling it in the right place (or not calling it correctly).

    #!/bin/bash
    INPUT_FILE="$1"
    declare -a LIST_CHARS

    if [ $# -ne 1 ]
    then
        echo "Usage: $0 <file_name>"
        exit 1
    fi

    if [ ! -f $INPUT_FILE ]
    then
        echo "$INPUT_FILE does not exists. Please specify correct file name"
        exit 2
    fi

    while read line
    do
        while read i
        do
            echo $line
            count=`grep -i $line | wc -l`
            echo "String $line appears $count times"
        done < $INPUT_FILE
    done < $INPUT_FILE
4 answers

The classic awk solution is something like:

  $ awk 'NF { count[toupper($0)]++ }
     END { for (name in count) print name " appears " count[name] " times" }' input
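
The awk one-liner prints the upper-cased key and visits the groups in an unspecified order. If the first spelling seen in the file and the file's own order matter, as in the expected output in the question, one possible variation is sketched here. The function name count_first_seen and the arrays first and order are illustrative and not part of the original answer; the sketch assumes lines that differ only in case should be grouped and that blank lines should be skipped.

```shell
# count_first_seen: hypothetical helper, not from the original answer.
# Reads the named files (or stdin) and groups lines case-insensitively,
# reporting each group under the first spelling encountered.
count_first_seen() {
    awk '
        NF {
            key = toupper($0)              # case-insensitive grouping key
            if (!(key in count)) {         # first time this key is seen
                first[key] = $0            # remember the original spelling
                order[++n] = key           # remember the input order
            }
            count[key]++
        }
        END {
            for (i = 1; i <= n; i++)
                print first[order[i]] " appears " count[order[i]] " times"
        }
    ' "$@"
}

# Demo with the sample data from the question (blank lines included).
printf 'Tim\n\ntim\n\nMark\n\nMark\n\nAllen\n\nALLen\n\nALLEN\n' | count_first_seen
```

For the sample data this prints "Tim appears 2 times", "Mark appears 2 times" and "Allen appears 3 times", in that order.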

You can also use sort and uniq with flags to ignore case:

 sort -f FILE | uniq -ic 

A simple sed command can then convert the output to the requested format:

 s/^ *\([0-9]\+\) \(.*\)/\2 appears \1 times/ 
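
Put together, the three commands might look like the sketch below. The function name format_counts is made up for illustration, and the sketch assumes the input has no empty lines (an empty line would come out as " appears N times"). The `\+` in the sed expression is a GNU extension, so the sketch spells it `[0-9][0-9]*`, which plain POSIX sed also accepts; `uniq -i` is not in POSIX either, but both GNU and BSD uniq provide it.

```shell
# format_counts: hypothetical helper name; reads lines on stdin.
format_counts() {
    # sort -f   : sort ignoring case so equal strings become adjacent
    # uniq -ic  : collapse adjacent duplicates ignoring case, prefix a count
    # sed       : turn "  3 ALLEN" into "ALLEN appears 3 times"
    sort -f | uniq -ic |
        sed 's/^ *\([0-9][0-9]*\) \(.*\)/\2 appears \1 times/'
}

# Demo with the names from the question, without the blank lines.
printf 'Tim\ntim\nMark\nMark\nAllen\nALLen\nALLEN\n' | format_counts
```

The spelling printed for each group is whichever variant sort happens to place first, so it can differ between locales.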

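Assuming data.txt contains your words, the following script will do it:

    while read -r line
    do
        # upper-case the line and strip spaces
        uc=$(echo "$line" | tr '[a-z]' '[A-Z]' | tr -d ' ')
        # print the word followed by the number of matching lines
        echo $uc $(grep -i "$uc" data.txt | wc -l)
    done < data.txt | sort | uniq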

Output:

    31
    ALLEN 6
    MARK 4
    MOKADDIM 1
    SHIPLU 1
    TIM 4

Another variant:

    sort -f data.txt | uniq -i -c | while read -r num word
    do
        echo "$(echo "$word" | tr '[a-z]' '[A-Z]') appears $num times"
    done

Note: I see that your text file contains empty lines. The bare 31 in the output comes from those lines: for them the search pattern is empty, and grep with an empty pattern matches every line in the file.
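
One way to keep the empty lines out of the result is to drop them before counting. The sketch below is only an illustration: the function name count_nonblank is hypothetical, and it assumes that lines containing nothing but whitespace should be ignored as well.

```shell
# count_nonblank: hypothetical helper; reads stdin and ignores blank lines.
count_nonblank() {
    # grep -v drops lines that are empty or contain only whitespace,
    # so they never reach sort and uniq.
    grep -v '^[[:space:]]*$' | sort -f | uniq -ic |
        while read -r num word
        do
            echo "$word appears $num times"
        done
}

# Demo: blank and whitespace-only lines are skipped.
printf 'Tim\n\ntim\n\n  \nMark\n\nMark\n' | count_nonblank
```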

    sort -f filename | uniq -ic | while read -r count line
    do
        # print the data however you like, for example:
        echo "$line appears $count times"
    done

Source: https://habr.com/ru/post/1392438/

