I have a file that looks like this:
64fe12c7-b50c-4f63-b292-99f4ed74e5aa, ip, 1.2.3.4,
64fe12c7-b50c-4f63-b292-99f4ed74e5aa, ip, 4.5.6.7,
bacd8a9d-807f-4ae9-95d2-f7cc17222cab, ip, 0.0.0.0/0, silly string
bacd8a9d-807f-4ae9-95d2-f7cc17222cab, ip, 0.0.0.0/0, crazy town
db86d211-0b09-4a8f-b222-a21a54ad2f9c, ip, 8.9.0.1, wild wood
db86d211-0b09-4a8f-b222-a21a54ad2f9c, ip, 0.0.0.0/0, wacky tabacky
611f8cf5-f6f2-4f3a-ad24-12245652a7bd, ip, 0.0.0.0/0, cuckoo cachoo
I would like to extract a list of unique GUIDs, keeping a GUID when either:
- none of its lines has 0.0.0.0/0 in column 3, or
- it does have 0.0.0.0/0 in column 3, but it appears on more than one line and at least one of those lines has a column 3 value other than 0.0.0.0/0
In this case, the desired output would be:
64fe12c7-b50c-4f63-b292-99f4ed74e5aa
db86d211-0b09-4a8f-b222-a21a54ad2f9c
Trying to think through this, it seems to me that I should build an array / list of unique GUIDs, then grep the corresponding lines for each and apply the two conditions above, but I don't know whether this is best done as a short script or, possibly, as a grep / awk / sort / cut one-liner. I'd appreciate any help!
(the source file is a 4-column CSV, where the 4th column is often null)
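For what it's worth, both conditions seem to collapse into a single test: a GUID qualifies exactly when at least one of its lines has a column 3 value other than 0.0.0.0/0. If that reading is right, a one-liner might be enough; here is a sketch in awk (the filename `input.csv` is just a placeholder, and the field separator assumes the comma-plus-space layout shown above):

```shell
# Keep a GUID if any of its lines has a third field that is not
# 0.0.0.0/0, and print each qualifying GUID once, in order of
# first appearance. FS ', *' tolerates a comma with or without
# a following space.
awk -F', *' '$3 != "0.0.0.0/0" && !seen[$1]++ { print $1 }' input.csv
```

On the sample data this would print the 64fe12c7… and db86d211… GUIDs and skip the other two, which matches the desired output, but I haven't verified it against the full file.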