Bash: search for a file with the maximum number of lines

Question

This is my attempt so far.

Find all *.java files:
 find . -name '*.java'
Count the lines in each file:
 wc -l
Delete the last row (the "total" line):
 sed '$d'
Use awk to find the largest line count in the wc output:
 awk 'max=="" || data=="" || $1 > max {max=$1 ; data=$2} END{ print max " " data}'

Then I combined it into one line:

 find . -name '*.java' | xargs wc -l | sed '$d' | awk 'max=="" || data=="" || $1 > max {max=$1 ; data=$2} END{ print max " " data}'

Is there any way to count only non-empty lines?

+6

unix bash awk sed wc

Marek sebera Dec 13 '11 at 11:21

4 answers

 find . -name "*.java" -type f | xargs wc -l|sort -rn|grep -v ' total$'|head -1

+5

Vijay Dec 13 '11 at 12:21

Something like this might work:

 find . -name '*.java' | while IFS= read -r filename; do nlines=`grep -v -E '^[[:space:]]*$' "$filename" | wc -l`; echo $nlines "$filename"; done | sort -nr | head -1

(Edited according to Ed Morton's comment. I must have had too much coffee :-))
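The loop above reads file names line by line, so it depends on how the names are quoted and split. As a rough sketch of the same idea without the `while read` loop, the function below lets `find` pass the paths directly to a small `sh` script. The name `longest_java` is made up for illustration, and the sketch still assumes no path contains a newline, because the result goes through `sort`.

```shell
# Hypothetical helper: print "<non-blank line count> <path>" for the *.java
# file under directory $1 (default: .) that has the most non-blank lines.
longest_java() {
    find "${1:-.}" -type f -name '*.java' -exec sh -c '
        for f in "$@"; do
            # grep -c prints the number of matching lines; the pattern matches
            # any line that contains at least one non-whitespace character.
            printf "%s %s\n" "$(grep -c "[^[:space:]]" < "$f")" "$f"
        done
    ' sh {} + | sort -nr | head -n 1
}
```

Called as `longest_java .` it searches the current directory. Reading each file through `<` keeps `grep` from printing the file name itself, so the count and the path are joined only once, by `printf`.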

0

holygeek Dec 13 '11 at 11:30

To get the line count of every file with awk:

 $ find . -name '*.java' -print0 | xargs -0 awk ' BEGIN { for (i=1;i<ARGC;i++) size[ARGV[i]]=0 } { size[FILENAME]++ } END { for (file in size) print size[file], file } '

To count only non-empty lines, make the rule that increments size[] conditional:

 $ find . -name '*.java' -print0 | xargs -0 awk ' BEGIN { for (i=1;i<ARGC;i++) size[ARGV[i]]=0 } NF { size[FILENAME]++ } END { for (file in size) print size[file], file } '

(NF already treats lines containing only whitespace as empty. If you want such lines to count as non-empty, replace NF with /^./.)
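A small experiment can make the difference between the two patterns concrete. The four-line sample input and the function names below are invented for illustration and are not part of the answer.

```shell
# Hypothetical sample: "a", a line holding a single space, an empty line, "b".
sample() { printf 'a\n \n\nb\n'; }

# NF is 0 for empty and whitespace-only lines, so only "a" and "b" are counted.
count_nf() { awk 'NF { n++ } END { print n + 0 }'; }

# /^./ matches any line with at least one character, so the line that holds
# a single space is counted as well.
count_any_char() { awk '/^./ { n++ } END { print n + 0 }'; }

sample | count_nf         # prints 2
sample | count_any_char   # prints 3
```

The `n + 0` in the END rule only makes sure a plain 0 is printed when no line matches.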

To get only the file with the most non-empty lines, change the END section so that it tracks the maximum:

 $ find . -name '*.java' -print0 | xargs -0 awk ' BEGIN { for (i=1;i<ARGC;i++) size[ARGV[i]]=0 } NF { size[FILENAME]++ } END { for (file in size) { if (size[file] >= maxSize) { maxSize = size[file]; maxFile = file } } print maxSize, maxFile } '

0

Ed Morton Nov 26 '12 at 22:20

Shawn chin · Accepted Answer · 2011-12-13T12:28:02+0000

 find . -type f -name "*.java" -exec grep -H -c '[^[:space:]]' {} \; | sort -nr -t":" -k2 | awk -F: '{print $1; exit;}'

Replace awk with head -n1 if you also want to see the number of non-empty lines.

Breakdown of the command:

 find . -type f -name "*.java" -exec grep -H -c '[^[:space:]]' {} \;
 '---------------------------'       '-----------------------'
               |                                 |
     for each *.java file            Use grep to count non-empty lines
                                     -H includes filenames in the output
                                     (output = ./full/path/to/file.java:count)

 | sort -nr -t":" -k2 | awk -F: '{print $1; exit;}'
   '----------------'   '-------------------------'
           |                         |
   Sort the output in   Print filename of the first entry
   reverse order using  (largest count), then exit immediately
   the second column
   (count)
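For large trees, one possible variation is to end `-exec` with `+` instead of `\;`, so that `find` hands many files to each `grep` call rather than starting one process per file. The sketch below wraps that variant in a function. The name `most_nonblank` is hypothetical, `-H` is a GNU/BSD grep option rather than a POSIX one, and the sort assumes that no path contains a colon or a newline.

```shell
# Hypothetical helper: print "<path>:<count>" for the *.java file under
# directory $1 (default: .) with the most lines containing non-whitespace.
most_nonblank() {
    # "{} +" batches many files into each grep call; -H keeps the file name
    # in the output even if a batch happens to contain a single file.
    find "${1:-.}" -type f -name '*.java' -exec grep -H -c '[^[:space:]]' {} + |
        sort -t: -k2,2nr |   # numeric, descending, on the count field only
        head -n 1
}
```

Running `most_nonblank .` prints a single `path:count` line. Piping that through `cut -d: -f1` would leave only the file name, much like the awk step in the answer.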
