I am trying to make scripts that I write simpler and easier.
There are many ways to get the word count of all files in a folder, or even of all files in every subdirectory of a folder.
For example, I could write
wc */*
and I can get this output (this is the desired result):
0 0 0 10.53400000/YRI.GS000018623.NONSENSE.vcf
0 0 0 10.53400000/YRI.GS000018623.NONSTOP.vcf
0 0 0 10.53400000/YRI.GS000018623.PFAM.vcf
0 0 0 10.53400000/YRI.GS000018623.SPAN.vcf
0 0 0 10.53400000/YRI.GS000018623.SVLEN.vcf
2 20 624 10.53400000/YRI.GS000018623.SVTYPE.vcf
2 20 676 10.53400000/YRI.GS000018623.SYNONYMOUS.vcf
13 130 4435 10.53400000/YRI.GS000018623.TSS-UPSTREAM.vcf
425 4250 126381 10.53400000/YRI.GS000018623.UNKNOWN-INC.vcf
but if there are too many files, I can get an error message like the following:
-bash: /usr/bin/wc: Argument list too long
so I could put the folder names in a list and process one folder at a time, for example:
while read -r FOLDER; do
    wc "$FOLDER"/* >> outfile.txt
done < "$FOLDER_LIST"
so this goes from one line to about five to do the same thing.
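For reference, one commonly suggested way around the "Argument list too long" limit without a hand-written loop is to let find build the argument lists itself. The following is only a minimal sketch: the function name wc_per_file is made up for illustration, and -mindepth/-maxdepth are assumed to be available (they are GNU/BSD extensions, not POSIX).

```shell
# Hypothetical helper: run wc on every regular file exactly one level
# below the given directory (default: current directory).
# "-exec ... {} +" makes find pass the file names to wc in batches that
# fit within the system's argument-length limit.
wc_per_file() {
    find "${1:-.}" -mindepth 2 -maxdepth 2 -type f -exec wc {} +
}
```

Called as wc_per_file . > outfile.txt, this should give roughly the same listing as wc */*, with a few differences: paths are prefixed with ./, hidden files are included, the order is whatever find returns, and wc prints one "total" line per batch rather than a single one.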
Also, in one case, I want to filter each file with grep -v first and then do a word count, for example:
grep -v dbsnp */* | wc
but this runs into two problems:
- Argument list too long
- Even if the list weren't too long, it would give one wc for all the files combined, not one per file.
So, to repeat, I would love to do this:
grep -v dbsnp */* | wc > Outfile.txt
awk '{print $4,$1}' Outfile.txt > Outfile.summary.txt
and get per-file output like that shown above.
Is there a very easy way to do this? Or am I looking at a loop at the very least? Again, I know 101 ways to do this the way we all do, using 4-10 lines of script, but I would like to be able to type just two single lines at the command line ... and my knowledge of the shell is not deep enough to know which approaches would let me do that.
EDIT -
A solution was proposed:
find -exec grep -v dbsnp {} \; | xargs -n 1 wc
This solution produces the following output:
wc: 1|0:53458644:AMBIGUOUS:CCAGGGC|-16&GCCAGGGCCAGGGC|-18&GCCAGGGCC|-19&GGCCAGGGC|-19&GCCAGGGCG|-19,.:48:48,48:4,4:0,17:-48,0,-48:0,0,-17:27:3,24:24: No such file or directory
wc: 10: No such file or directory
wc: 53460829: No such file or directory
wc: .: Is a directory
0 0 0 .
wc: AA: No such file or directory
wc: CT: No such file or directory
wc: .: Is a directory
0 0 0 .
wc: .: Is a directory
0 0 0 .
As far as I can tell, it seems to treat each whitespace-separated piece of the grep output as a file name. I am still looking at other answers and thank you for your help.
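The errors above are consistent with xargs splitting grep's output on whitespace and handing each piece to wc as if it were a file name. A variation that might avoid this, sketched under the assumption that only the per-file line count is needed (the $1 column of wc), is to let grep do the counting with -c. The function name count_nonmatching is hypothetical, and -H and -mindepth/-maxdepth are assumed to be supported (GNU and BSD tools have them, POSIX does not require them).

```shell
# Hypothetical helper: for every regular file one level below a directory,
# print "path:N", where N is the number of lines NOT matching the pattern.
#   $1  pattern to exclude (for example dbsnp)
#   $2  top-level directory (default: current directory)
# -v inverts the match, -c counts selected lines per file, and
# -H forces the file name to be printed even when a batch holds one file.
count_nonmatching() {
    find "${2:-.}" -mindepth 2 -maxdepth 2 -type f -exec grep -H -v -c -e "$1" {} +
}
```

For example, count_nonmatching dbsnp . > Outfile.summary.txt would write one path:count line per file. This reports line counts only; word and byte counts of the filtered text would still need grep -v piped into wc once per file.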