Linux Terminal: finding the number of lines longer than x

I come to you with a problem that puzzled me. I am trying to find the number of lines in a file (in this case, the html of a particular site) longer than x (which in this case is 80).

For example: google.com has 7 lines (checked with wc -l), two of which are longer than 80 (checked with awk '{print NF}'). I am trying to find a way to count how many lines are longer than 80, and then output that number.

My command looks like this: wget -qO - google.com | awk '{print NF}' | sort -g

I was thinking of just counting the lines with values greater than 80, but I can't figure out the syntax for that. Perhaps awk? Maybe I'm going about this in the most awkward way and hitting a wall for some reason.

Thanks for the help!

Edit: The unit of measure is characters. The command should find the number of lines with more than 80 characters.

3 answers

If you need the number of lines longer than 80 characters (your question does not specify units), grep is a good candidate. A line is longer than 80 characters when it contains at least 81, so match 81 of them:

 grep -c '.\{81\}' 

So:

 wget -qO - google.com | grep -c '.\{81\}' 

outputs 6.
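
If the threshold should be a parameter rather than hard-coded, the grep approach can be wrapped in a small function. This is only a sketch: count_longer_than is a made-up helper name, and it relies on the fact that a pattern of the form .\{N\} matches any line with at least N characters, so it asks for N+1 to count lines strictly longer than N.

```shell
# count_longer_than is a hypothetical helper name, not part of grep.
# Usage: count_longer_than N < file
count_longer_than() {
    # "Longer than N" means "at least N+1 characters", so build the
    # pattern .\{N+1\} and let grep -c count the matching lines.
    grep -c ".\{$(( $1 + 1 ))\}"
}

# Example: count_longer_than 80 < page.html   (page.html is a placeholder)
```

Note that grep -c still prints 0 when nothing matches, but it exits with a non-zero status in that case, which matters if the script runs under set -e.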

Using awk:

 wget -qO - google.com | awk 'NF>80{count++} END{print count}' 

This gives 2 as output, as there are two lines with more than 80 fields.

If you mean the number of characters (I assumed fields, based on what you have in the question), then:

 wget -qO - google.com | awk 'length($0)>80{c++} END{print c}' 

which gives 6.
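
A slightly more defensive variant of the character count, as a sketch: the limit is passed in with awk's -v option instead of being hard-coded, and the counter is printed as c + 0 so that the output is 0 rather than an empty line when no line exceeds the limit. The function name count_long_awk and the variable name max are made up for illustration.

```shell
# count_long_awk is a hypothetical helper name.
# Usage: count_long_awk N < file
count_long_awk() {
    # max is set from the first argument; c + 0 forces a numeric 0
    # when c was never incremented.
    awk -v max="$1" 'length($0) > max { c++ } END { print c + 0 }'
}

# Example: wget -qO - google.com | count_long_awk 80
```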

Blue Moon's answer (in its original version) counts fields, not characters. Since awk's default field separator is whitespace, NF gives you the number of words on a line, not the length of the line.

Try the following:

 wget -qO - google.com | awk '{ if (length($0) > 80) count++; } END{print count}' 
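
To see the difference between fields and characters on a single line, a quick check like the following may help. It is only an illustration; fields_and_chars is a made-up name and the sample sentence is arbitrary.

```shell
# fields_and_chars is a hypothetical helper: for each input line it
# prints the number of whitespace-separated fields and the line length.
fields_and_chars() {
    awk '{ print NF, length($0) }'
}

printf '%s\n' 'the quick brown fox' | fields_and_chars
# prints: 4 19
```

The same line counts as 4 under NF and as 19 under length($0), which is why the two approaches give different totals for the same page.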

Source: https://habr.com/ru/post/1207247/

