How to count occurrences of a word in all files in a directory?

I'm trying to count occurrences of a specific word across a whole directory. Is it possible?

Say, for example, there is a directory with 100 files, any of which may contain the word "aaa". How can I count the occurrences of "aaa" across all files in this directory?

I tried something like:

 zegrep "xception" `find . -name '*auth*application*'` | wc -l

But it does not work.

+43
linux unix grep find count
May 26 '11 at 7:20 a.m.
8 answers

grep -roh aaa . | wc -w

This recursively searches every file under the current directory for aaa and prints only the matches, not the entire line. wc then counts the number of words that were printed.
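A minimal sketch of why -o matters here, assuming a grep that supports -r and -o (GNU and BSD grep both do). The helper name count_occurrences and the scratch directory are made up for illustration. A line that contains the word twice contributes two matches with -o, whereas counting lines would only see one.

```shell
# Hypothetical helper: count every occurrence of a fixed string under a directory.
count_occurrences() {
    # -r recurse, -o print each match on its own line, -h omit file names,
    # -F treat the pattern as a literal string; -- ends option parsing
    grep -rohF -- "$1" "$2" | wc -l | tr -d ' '
}

# Scratch data for the demonstration (not part of the original answer).
demo_dir=$(mktemp -d)
printf 'aaa bbb aaa\n' > "$demo_dir/one.txt"
mkdir "$demo_dir/sub"
printf 'ccc aaa\n' > "$demo_dir/sub/two.txt"

count_occurrences aaa "$demo_dir"
```

With the sample files above this should print 3: two matches from one.txt and one from sub/two.txt.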

+76
May 26 '11 at 8:30

Another solution based on find and grep.

 find . -type f -exec grep -o aaa {} \; | wc -l 

It should correctly handle file names with spaces in them.
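As a hedged variation on the same idea, the sketch below restricts the search to files whose names match a glob (like the '*auth*application*' pattern in the question) and uses the POSIX "-exec ... {} +" form so grep is started once per batch of files instead of once per file. The function name and the sample files are assumptions made for the example.

```shell
# Hypothetical helper: count occurrences of a word in files whose names match a glob.
count_in_matching_files() {
    # $1 = word, $2 = directory, $3 = file name glob
    # find passes the file names straight to grep, so spaces in names are safe;
    # -h suppresses the file name prefix grep adds when given several files
    find "$2" -type f -name "$3" -exec grep -ohF -- "$1" {} + | wc -l | tr -d ' '
}

# Scratch data: one matching file name (with a space in it) and one that does not match.
find_demo=$(mktemp -d)
printf 'aaa aaa\n' > "$find_demo/auth application.log"
printf 'aaa\n' > "$find_demo/other.txt"

count_in_matching_files aaa "$find_demo" '*auth*application*'
```

Here only "auth application.log" is searched, so the result should be 2; passing '*' as the glob would include other.txt as well.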

+6
May 28 '11 at 14:35

Let's use AWK!

 $ function wordfrequency() { awk 'BEGIN { FS="[^a-zA-Z]+" } { for (i=1; i<=NF; i++) { word = tolower($i); words[word]++ } } END { for (w in words) printf("%3d %s\n", words[w], w) } ' | sort -rn; }
 $ cat your_file.txt | wordfrequency

It lists the frequency of each word in the provided file. If you want to see the occurrences of your word, you can simply do this:

 $ cat your_file.txt | wordfrequency | grep yourword 

To find occurrences of your word in all files in a directory (non-recursively), you can do this:

 $ cat * | wordfrequency | grep yourword 

To find the occurrences of your word in all files in a directory (and its subdirectories), you can do this:

 $ find . -type f | xargs cat | wordfrequency | grep yourword 

Source: AWK-ward Ruby

+3
Dec 15 '14 at 10:40

 find . -type f | xargs perl -pe 's/ /\n/g' | grep aaa | wc -l
+1
May 26 '11 at 7:33 a.m.

The easiest way is to use grep. Try grep --help for more information.

  • To count the lines that contain a word in a particular file (grep -c counts matching lines, not individual occurrences):

     grep -c <word> <file_name> 

    Example:

     grep -c 'aaa' abc_report.csv 

    Output:

     445 



  • To get the same per-file count of matching lines for every file under a directory:

     grep -c -R <word> 

    Example:

     grep -c -R 'aaa' 

    Output:

     abc_report.csv:445
     lmn_report.csv:129
     pqr_report.csv:445
     my_folder/xyz_report.csv:408
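
The per-file output above can be reduced to a single number by summing the last colon-separated field. This is only a sketch: the function name and sample files are invented for the example, and the total is a count of matching lines, so a line containing the word twice still counts once.

```shell
# Hypothetical helper: sum the per-file counts printed by grep -c -r.
total_matching_lines() {
    # Each output line looks like "path:count"; $NF is the count even if the
    # path itself contains a colon. "+ 0" makes awk print 0 for empty input.
    grep -rcF -- "$1" "$2" | awk -F: '{ total += $NF } END { print total + 0 }'
}

# Scratch data: 2 matching lines in the first file, 1 in the second.
lines_demo=$(mktemp -d)
printf 'aaa aaa\naaa\nbbb\n' > "$lines_demo/a.csv"
printf 'aaa\n' > "$lines_demo/b.csv"

total_matching_lines aaa "$lines_demo"
```

For the sample data this should print 3, although the word itself appears four times.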
+1
Mar 13 '16 at 3:22

Cat the files and grep the output:

 cat $(find /usr/share/doc/ -name '*.txt') | zegrep -ic '\<exception\>'

If you also want longer words that contain "exception" (such as "exceptions") to match, drop the '\<' and '\>' around the word.

0
May 26 '11 at 7:27 a.m.

How about starting with:

 cat * | sed 's/ /\n/g' | grep '^aaa$' | wc -l 

as in the following transcript:

 pax$ cat file1
 this is a file number 1
 pax$ cat file2
 And this file is file number 2, a slightly larger file
 pax$ cat file[12] | sed 's/ /\n/g' | grep 'file$' | wc -l
 4

sed converts spaces to newlines (you can include other whitespace, such as tabs, with sed 's/[ \t]/\n/g'). grep then keeps only the lines that consist of the desired word, and wc counts those lines for you.

There may be edge cases where this script does not work, but in the vast majority of cases it should be fine.
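
One such case is whitespace other than a single space. A possible variation, sketched here under the assumption that splitting on any run of whitespace is what you want, uses tr instead of sed and lets grep count exact-line matches directly. The function name and the sample files (modelled on the transcript above) are illustrative only.

```shell
# Hypothetical helper: count tokens that are exactly the given word.
count_exact_words() {
    # $1 = word; remaining arguments = files to read
    word=$1
    shift
    # tr -s squeezes every run of spaces, tabs and newlines into one newline,
    # leaving one token per line; grep -x matches whole lines only, -c counts them.
    # grep exits non-zero when the count is 0, hence the "|| true".
    cat -- "$@" | tr -s '[:space:]' '\n' | grep -cxF -- "$word" || true
}

words_demo=$(mktemp -d)
printf 'this is a file number 1\n' > "$words_demo/file1"
printf 'And this file is file number 2, a slightly larger file\n' > "$words_demo/file2"

count_exact_words file "$words_demo/file1" "$words_demo/file2"
```

Note that punctuation stays attached to a token, so "2," is not the same token as "2".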

If you need a whole tree (and not just one directory level), you can use something like:

 ( find . -name '*.txt' -exec cat {} ';' ) | sed 's/ /\n/g' | grep '^aaa$' | wc -l 
0
May 26 '11 at 7:28 a.m.

grep also has a regex syntax for matching whole words only:

 # based on Carlos Campderrós solution posted in this thread
 man grep | less -p '\<'
 grep -roh '\<aaa\>' . | wc -l

For another regex syntax matching words, see:

 man re_format | less -p '\[\[:<:\]\]' 
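
As a rough illustration of the difference whole-word matching makes, the sketch below uses grep's -w option (available in GNU and BSD grep) rather than the '\<' and '\>' anchors. The helper name and sample file are assumptions for the example.

```shell
# Hypothetical helper: count whole-word occurrences only.
count_whole_word() {
    # -w accepts a match only when it is not adjacent to other word characters
    grep -rohwF -- "$1" "$2" | wc -l | tr -d ' '
}

whole_demo=$(mktemp -d)
printf 'aaa aaab baaa aaa\n' > "$whole_demo/sample.txt"

count_whole_word aaa "$whole_demo"                     # whole words only
grep -rohF -- aaa "$whole_demo" | wc -l | tr -d ' '    # also counts aaab and baaa
```

On the sample line the first command should report 2 and the second 4.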
0
May 28 '11 at 18:20