I am reading a file like this:
cat access_logs | grep Ruby
to determine which IP addresses are accessing one of my files. It returns a huge list. I want to remove near-duplicates; the lines below are technically the same except for their date and time stamps. In a massive list with thousands of retries, is there a way to get only the unique IP addresses?
1.2.3.4 - - [13/Apr/2014:14:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
1.2.3.4 - - [13/Apr/2014:14:20:38 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
1.2.3.4 - - [13/Apr/2014:15:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
1.2.3.4 - - [13/Apr/2014:15:20:38 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
So, for example, could these four lines be collapsed into just one?
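For reference, the kind of pipeline being asked about might look like the sketch below. It assumes the client IP is always the first whitespace-separated field, as in the sample lines above (the `access_logs` lines here are made-up stand-ins for the real file):

```shell
# Hypothetical sample data in the same combined-log format as the question
cat > access_logs <<'EOF'
1.2.3.4 - - [13/Apr/2014:14:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
1.2.3.4 - - [13/Apr/2014:14:20:38 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
5.6.7.8 - - [13/Apr/2014:15:20:17 -0400] "GET /color.txt HTTP/1.1" 404 207 "-" "Ruby"
EOF

# Keep only lines mentioning Ruby, print field 1 (the client IP),
# then sort and drop duplicates so each IP appears once
grep Ruby access_logs | awk '{print $1}' | sort -u
```

With the sample data above this prints each distinct IP once, no matter how many requests it made.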
Jason