Filter a text file to get unique entries based on the value in column 3

I know a little bash, but I am stuck on a file-filtering problem. Here is an example.

For a text file such as the following (file1):

 10.10.12 bib24 Avenger goodone
 10.10.12 bib21 The_Dark_Knight_Rises betterone
 10.10.12 bib53 Avenger goodone
 10.10.12 bib35 Ice_Age wow
 11.10.12 bib53 TheAmazingSpiderMan nice
 11.10.12 bib54 TheAmazingSpiderMan nice
 11.10.12 bib01 Avenger goodone
 12.10.12 bib29 Avenger goodone
 12.10.12 bib11 TheAmazingSpiderMan nice
 12.10.12 bib03 Ice_Age wow
 12.10.12 bib98 Ice_Age wow
 14.10.12 bib12 Ice_Age wow

This is the result I want (file2):

 10.10.12 bib24 Avenger goodone
 10.10.12 bib21 The_Dark_Knight_Rises betterone
 10.10.12 bib35 Ice_Age wow
 11.10.12 bib53 TheAmazingSpiderMan nice

So my question is: which command produces this result (file2), i.e. keeps only the first entry for each movie (column 3), ignoring columns/fields 1, 2, and 4?

I hope this is clear enough.

3 answers

Here is one way, using awk:

 awk '!a[$3]++' file.txt 

Results:

 10.10.12 bib24 Avenger goodone
 10.10.12 bib21 The_Dark_Knight_Rises betterone
 10.10.12 bib35 Ice_Age wow
 11.10.12 bib53 TheAmazingSpiderMan nice
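The one-liner is terse, so here is a longer sketch of what it appears to do: the array entry for a column-3 value is empty (zero) the first time that value is seen, so the line is printed, and the counter is then incremented so later lines with the same value are skipped. The array name seen and the temporary file are illustrative choices, not part of the answer; the sample lines are a subset of the question's data.

```shell
# Build a small sample in a temporary file (a few lines from the question's data).
sample=$(mktemp)
printf '%s\n' \
  '10.10.12 bib24 Avenger goodone' \
  '10.10.12 bib21 The_Dark_Knight_Rises betterone' \
  '10.10.12 bib53 Avenger goodone' \
  '10.10.12 bib35 Ice_Age wow' \
  '12.10.12 bib03 Ice_Age wow' > "$sample"

# Long form of the one-liner: seen[$3] counts how often a column-3 value has appeared.
result=$(awk '{
    if (seen[$3] == 0) print   # first time this value appears: keep the line
    seen[$3]++                 # remember it so later duplicates are skipped
}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because awk processes the file top to bottom, the surviving lines stay in their original order, which is what the requested file2 shows.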

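Try sort with the key restricted to column 3 (the output is ordered by that column, not by the original line order):

 sort -u -k3,3 file.txt 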

Output

 10.10.12 bib24 Avenger goodone
 10.10.12 bib35 Ice_Age wow
 11.10.12 bib53 TheAmazingSpiderMan nice
 10.10.12 bib21 The_Dark_Knight_Rises betterone
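One detail worth checking with sort keys: -k3 defines a key that runs from field 3 to the end of the line, whereas -k3,3 limits the key to field 3 alone. With the question's data both give the same result, because column 4 always matches column 3, but that is an accident of the sample. The sketch below uses hypothetical data (the "great" comment is invented for illustration) and LC_ALL=C to keep the comparison locale-independent.

```shell
# Hypothetical sample: both Avenger lines share column 3 but differ in column 4.
data='10.10.12 bib24 Avenger goodone
11.10.12 bib01 Avenger great
10.10.12 bib35 Ice_Age wow'

# -k3 : key runs from field 3 to the end of the line, so the Avenger lines differ.
loose_count=$(printf '%s\n' "$data" | LC_ALL=C sort -u -k3 | awk 'END { print NR }')

# -k3,3 : key is field 3 only, so the Avenger lines compare equal and one is dropped.
strict_count=$(printf '%s\n' "$data" | LC_ALL=C sort -u -k3,3 | awk 'END { print NR }')

echo "with -k3: $loose_count lines; with -k3,3: $strict_count lines"
```

Which of the equal lines survives -u is not specified by POSIX, so if the first occurrence in the file matters, the awk approach is the safer choice.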

For rusty csh users:

Use this:

 awk '{c[$3]++} {if (c[$3] == 1) print $0}' file.txt 

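With the original answer, csh reports an “event not found” error, because it treats “!” as a history-expansion character even inside single quotes. You could also escape the “!” so that it is treated as a normal character, but this version is easier to read and use.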


Source: https://habr.com/ru/post/1440549/

