How to count occurrences of a word in all the files of a directory?

Question

I'm trying to count occurrences of a specific word across a whole directory. Is this possible?

Say, for example, there is a directory with 100 files, any of which may contain the word "aaa". How can I count the number of occurrences of "aaa" across all files in this directory?

I tried something like:

zegrep "xception" `find . -name '*auth*application*'` | wc -l

But it does not work.

+43

linux unix grep find count

Ashish Sharma May 26 '11 at 7:20 a.m.

8 answers

Another solution based on find and grep.

 find . -type f -exec grep -o aaa {} \; | wc -l

It should correctly handle file names with spaces in them.
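
As a minimal sketch, the same idea can be wrapped in a small function. The name count_occurrences and its two arguments are hypothetical, not part of the answer above. It uses "{} +" instead of "{} \;" so that find hands many files to each grep call, and it assumes a grep that supports -o and -h (GNU and BSD grep do).

```shell
# count_occurrences DIR WORD   (hypothetical helper name)
# Prints the total number of times WORD occurs in regular files under DIR.
count_occurrences() {
    # -o prints every match on its own line, -h omits file names,
    # -F treats WORD as a fixed string rather than a regex.
    # "{} +" batches file names, so names with spaces are passed intact.
    find "$1" -type f -exec grep -ohF -e "$2" {} + | wc -l | tr -d '[:space:]'
    echo    # tr stripped the trailing newline; put one back
}
```

Called as count_occurrences . aaa, it prints a single number. Like the one-liner above, it counts substring matches, so "xaaa" also counts once.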

+6

Fredrik Pihl May 28 '11 at 14:35

Let's use AWK!

 $ function wordfrequency() { awk 'BEGIN { FS="[^a-zA-Z]+" } { for (i=1; i<=NF; i++) { word = tolower($i); words[word]++ } } END { for (w in words) printf("%3d %s\n", words[w], w) } ' | sort -rn; }
 $ cat your_file.txt | wordfrequency

It lists the frequency of each word in the provided file. If you want to see the occurrences of your word, you can simply do this:

 $ cat your_file.txt | wordfrequency | grep yourword

To find occurrences of your word in all files in a directory (non-recursively), you can do this:

 $ cat * | wordfrequency | grep yourword

To find the occurrences of your word in all files in a directory (and its subdirectories), you can do this:

 $ find . -type f | xargs cat | wordfrequency | grep yourword
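
One caveat: find piped into plain xargs splits file names on whitespace, so names containing spaces can break it. A hedged sketch of a variant that counts a single word directly, using find's own -exec to avoid the problem; the function name count_word_ci is made up for illustration and it reuses the same field separator as wordfrequency:

```shell
# count_word_ci DIR WORD   (hypothetical helper name)
# Case-insensitive count of WORD as a whole alphabetic token under DIR.
count_word_ci() {
    # "-exec cat {} +" passes file names to cat intact, spaces included.
    find "$1" -type f -exec cat {} + |
        awk -v target="$2" '
            BEGIN { FS = "[^a-zA-Z]+"; target = tolower(target) }
            # Every run of non-letters separates fields, so "aaa," yields "aaa".
            { for (i = 1; i <= NF; i++) if (tolower($i) == target) n++ }
            END { print n + 0 }   # n + 0 prints 0 when nothing matched
        '
}
```

For example, count_word_ci . yourword prints one total for the whole tree. Because it compares whole tokens, "aaab" does not count as "aaa".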

Source: AWK-ward Ruby

+3

Sheharyar Dec 15 '14 at 10:40

 find . -type f | xargs perl -p -e 's/ /\n/g' | grep aaa | wc -l

+1

Vijay May 26 '11 at 7:33 a.m.

The easiest way is to use grep. Try grep --help for more information.

To count the lines that contain a word in a particular file:

 grep -c <word> <file_name>

Example:

 grep -c 'aaa' abc_report.csv

Output:

 445

To get a per-file count for every file in a directory, recursively:

 grep -c -R <word>

Example:

 grep -c -R 'aaa'

Output:

 abc_report.csv:445
 lmn_report.csv:129
 pqr_report.csv:445
 my_folder/xyz_report.csv:408
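
Keep in mind that grep -c reports matching lines, so a line containing the word twice is counted once. A small sketch of the difference, using a throwaway sample file created just for this illustration:

```shell
# Hypothetical sample file: the first line contains "aaa" twice.
sample=$(mktemp)
printf 'aaa bbb aaa\nccc\naaa\n' > "$sample"

# Number of lines containing the word (what -c reports).
matching_lines=$(grep -c 'aaa' "$sample")

# Number of individual occurrences: -o prints one match per output line.
occurrences=$(grep -o 'aaa' "$sample" | wc -l | tr -d '[:space:]')

echo "lines=$matching_lines occurrences=$occurrences"   # lines=2 occurrences=3
rm -f "$sample"
```

If the word never appears more than once per line, the two numbers agree.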

+1

Parag Tyagi -morpheus- Mar 13 '16 at 3:22

Concatenate the files and grep the output:

 cat $(find /usr/share/doc/ -name '*.txt') | zegrep -ic '\<exception\>'

If you also want words that merely contain "exception" (such as "exceptions") to match, do not use '\<' and '\>' around the word.
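
To illustrate what the word-boundary anchors change, here is a small sketch on a made-up sample file. It uses grep's -w option, which GNU and BSD grep support and which I am assuming behaves like wrapping the pattern in '\<' and '\>' for a plain word like this one:

```shell
# Hypothetical sample with the bare word, a longer word, and mixed case.
wb_sample=$(mktemp)
printf 'exception\nexceptions thrown\nAn Exception occurred\n' > "$wb_sample"

# Substring match, case-insensitive: all three lines match.
substring_lines=$(grep -ic 'exception' "$wb_sample")

# Whole-word match: "exceptions" no longer counts.
whole_word_lines=$(grep -icw 'exception' "$wb_sample")

echo "substring=$substring_lines whole_word=$whole_word_lines"   # substring=3 whole_word=2
rm -f "$wb_sample"
```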

0

jcomeau_ictx May 26 '11 at 7:27 a.m.

How about starting with:

 cat * | sed 's/ /\n/g' | grep '^aaa$' | wc -l

as in the following transcript:

 pax$ cat file1
 this is a file number 1
 pax$ cat file2
 And this file is file number 2, a slightly larger file
 pax$ cat file[12] | sed 's/ /\n/g' | grep 'file$' | wc -l
 4

sed converts spaces to newlines (you can include other whitespace, such as tabs, with sed 's/[ \t]/\n/g'). grep then keeps only the lines that consist of the desired word, and wc counts those lines for you.

Now there may be edge cases where this script does not work, but in the vast majority of cases it should be fine.

If you need a whole tree (and not just one directory level), you can use something like:

 ( find . -name '*.txt' -exec cat {} ';' ) | sed 's/ /\n/g' | grep '^aaa$' | wc -l
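
One portability note: "\n" in a sed replacement is understood by GNU sed, but BSD sed (as shipped on macOS) does not turn it into a newline. A hedged alternative sketch using tr for the splitting step; the function name count_exact_word is invented for this example:

```shell
# count_exact_word WORD   (hypothetical helper name)
# Reads text on stdin and prints how many whitespace-separated tokens equal WORD.
count_exact_word() {
    # tr turns every whitespace character into a newline and -s squeezes
    # repeats, leaving one token per line.
    # grep -x matches whole lines only, -F takes WORD literally, -c counts.
    tr -s '[:space:]' '\n' | grep -cxF -e "$1"
}
```

It slots into the pipelines above in place of the sed, grep and wc stages, for example: cat * | count_exact_word aaa. As with the sed version, a token with attached punctuation such as "aaa," is not counted.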

0

paxdiablo May 26 '11 at 7:28 a.m.

There is also grep regex syntax for matching only whole words:

 # based on Carlos Campderrós solution posted in this thread
 man grep | less -p '\<'
 grep -roh '\<aaa\>' . | wc -l

For another regex syntax matching words, see:

 man re_format | less -p '\[\[:<:\]\]'

0

tim May 28 '11 at 18:20

Carlos Campderrós · Accepted Answer · 2011-05-26 08:30

grep -roh aaa . | wc -w

This recursively greps all files and directories under the current directory for aaa (-r) and prints only the matches, not the entire line (-o), without file names (-h). Then just use wc to count the number of words.
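
Counting with wc -w works because each match of a single word is exactly one word. If the pattern can contain a space, wc -l is the safer counter, since grep -o prints one match per line. A small sketch on a throwaway directory (the file name and contents are made up for illustration):

```shell
# Hypothetical sample directory with a two-word phrase on two lines.
phrase_dir=$(mktemp -d)
printf 'foo bar\nfoo bar baz\n' > "$phrase_dir/a.txt"

# wc -w counts the words inside every match: 2 matches x 2 words.
by_words=$(grep -roh 'foo bar' "$phrase_dir" | wc -w | tr -d '[:space:]')

# wc -l counts the matches themselves.
by_lines=$(grep -roh 'foo bar' "$phrase_dir" | wc -l | tr -d '[:space:]')

echo "wc -w: $by_words, wc -l: $by_lines"   # wc -w: 4, wc -l: 2
rm -rf "$phrase_dir"
```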
