Can grep remove lines that are semi-duplicates?

I am reading a file like this:

cat access_logs | grep Ruby

I do this to determine which IP addresses are accessing one of my files. It returns a huge list. I want to remove semi-duplicates, i.e. lines that are technically the same except for different time and date stamps. In a massive list with thousands of retries, is there a way to get only the unique IP addresses?

1.2.3.4 - - [13/Apr/2014:14:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
1.2.3.4 - - [13/Apr/2014:14:20:38 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
1.2.3.4 - - [13/Apr/2014:15:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
1.2.3.4 - - [13/Apr/2014:15:20:38 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"

So, for example, can these 4 lines be trimmed down to only one line?

2 answers

You can use awk:

awk '/Ruby/ && !seen[$1]++' access_logs

This prints only the first matching line for each IP address, no matter how the timestamps on that address's later lines differ.

For the sample input, it prints:

1.2.3.4 - - [13/Apr/2014:14:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
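The `!seen[$1]++` idiom can be hard to read at first, so here is a rough sketch of the same logic written out in long form. The temporary file and the second IP address (5.6.7.8) are made up for illustration and merely stand in for the real access_logs; `seen` is an ordinary awk associative array keyed by the first field.

```shell
# Hypothetical sample data standing in for access_logs.
log=$(mktemp)
cat > "$log" <<'EOF'
1.2.3.4 - - [13/Apr/2014:14:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
1.2.3.4 - - [13/Apr/2014:14:20:38 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
5.6.7.8 - - [13/Apr/2014:15:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
5.6.7.8 - - [13/Apr/2014:15:20:38 -0400] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla"
EOF

# Long form of: awk '/Ruby/ && !seen[$1]++'
first_hits=$(awk '
  /Ruby/ {
    if (seen[$1] == 0) print   # IP not seen yet: print the whole line
    seen[$1]++                 # count this IP so later lines are skipped
  }
' "$log")

printf '%s\n' "$first_hits"
rm -f "$log"
```

With this sample, one line per address should come out: the first Ruby line for 1.2.3.4 and the first Ruby line for 5.6.7.8.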

You can do:

awk '/Ruby/{print $1}' file | sort -u

This does the work of grep + cut in a single awk command, and sort -u then removes the duplicate addresses.
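For comparison, a grep and cut pipeline that yields the same list of addresses might look like the sketch below. It assumes the fields are separated by single spaces, as in the sample lines from the question; the temporary file and the second IP address are invented for the demonstration.

```shell
# Hypothetical sample data standing in for access_logs.
log=$(mktemp)
cat > "$log" <<'EOF'
1.2.3.4 - - [13/Apr/2014:14:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
5.6.7.8 - - [13/Apr/2014:14:20:38 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
1.2.3.4 - - [13/Apr/2014:15:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
EOF

# grep selects the Ruby lines, cut keeps the first space-separated field,
# and sort -u collapses repeated addresses into one entry each.
unique_ips=$(grep 'Ruby' "$log" | cut -d ' ' -f 1 | sort -u)

printf '%s\n' "$unique_ips"
rm -f "$log"
```

Unlike the `seen[$1]++` approach, this keeps only the addresses, not the full log lines, and the result is sorted rather than in order of first appearance.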


Source: https://habr.com/ru/post/1536338/

