I am just learning Python, using Python 2.7. I have a CSV file with two columns:
Coll_id: a collector, which can be either a single collector or a group.
Participant_Coll_id: if Coll_id is a single collector, the value is NULL. If Coll_id is a group, there is one row for each member of the group.
Sample is here:
    Coll_id,Participant_Coll_id
    ARA,ARG
    ARA,RAT
    ARG,NULL
    BRSAR,SGMB
    BRSAR,SANTM
    BRSAR,CRSR
    BRSAR,RAT
    CRSR,NULL
    DBY,NULL
    HZIE,NULL
    RAT,NULL
    SANTM,NULL
    SGMB,NULL
    ARG,NULL
    DRS,CRSR
    DRS,RAT
    DRS,ARG
For each collector (Coll_id), I am trying to create a list of all the other collectors with whom they have collected. I have tried to put together code to do this, and I am now close:
    import csv
    from collections import defaultdict

    # This is giving me a dictionary with each COLL_ID having a list of PARTICIPANT_COLL_IDs
    # Field names are passed explicitly, so DictReader treats the header row as data.
    with open('colls_mv1.csv', 'r') as f:
        reader = csv.DictReader(f, ['COLL_ID', 'PARTICIPANT_COLL_ID'])
        data1 = defaultdict(list)
        for line in reader:
            data1[line['COLL_ID']].append(line['PARTICIPANT_COLL_ID'])
I get the following output:
{'SGMB': [['SGMB', 'SANTM', 'CRSR', 'RAT']], 'CRSR': [['SGMB', 'SANTM', 'CRSR', 'RAT'], ['CRSR', 'RAT', 'ARG']], 'RAT': [['ARG', 'RAT'], ['SGMB', 'SANTM', 'CRSR', 'RAT'], ['CRSR', 'RAT', 'ARG']], 'PARTICIPANT_COLL_ID': [['PARTICIPANT_COLL_ID']], 'ARG': [['ARG', 'RAT'], ['CRSR', 'RAT', 'ARG']], 'SANTM': [['SGMB', 'SANTM', 'CRSR', 'RAT']]}
I would like to combine the lists of values for each key, remove duplicates, and remove the key itself from its list of values:
{'SGMB': ['SANTM', 'CRSR', 'RAT'], 'CRSR': ['SGMB', 'SANTM', 'RAT', 'ARG'], 'RAT': ['ARG', 'SGMB', 'SANTM', 'CRSR'], 'PARTICIPANT_COLL_ID': [['PARTICIPANT_COLL_ID']], 'ARG': ['RAT', 'CRSR'], 'SANTM': ['SGMB', 'CRSR', 'RAT']}
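To make the goal concrete, here is a minimal sketch of the merge step described above. It assumes the intermediate dictionary has the shape shown in the first output (each key maps to a list of group lists). The function name `merge_groups` and the variable names are hypothetical and only for illustration; this is one possible approach, not necessarily the best one.

```python
def merge_groups(groups_by_collector):
    """Flatten each collector's list of groups into one de-duplicated list.

    The collector's own ID is dropped and first-seen order is kept.
    merge_groups is a hypothetical helper name, not part of any library.
    """
    merged = {}
    for collector, groups in groups_by_collector.items():
        # Pre-seeding the set with the key means the key itself is skipped.
        seen = set([collector])
        companions = []
        for group in groups:
            for member in group:
                if member not in seen:
                    seen.add(member)
                    companions.append(member)
        merged[collector] = companions
    return merged


# Two entries copied from the first output above.
sample = {
    'CRSR': [['SGMB', 'SANTM', 'CRSR', 'RAT'], ['CRSR', 'RAT', 'ARG']],
    'RAT': [['ARG', 'RAT'], ['SGMB', 'SANTM', 'CRSR', 'RAT'],
            ['CRSR', 'RAT', 'ARG']],
}
result = merge_groups(sample)
# result['CRSR'] is ['SGMB', 'SANTM', 'RAT', 'ARG']
# result['RAT'] is ['ARG', 'SGMB', 'SANTM', 'CRSR']
```

A set tracks what has already been seen, while a separate list preserves the order in which members first appear. Note that with this sketch the header-derived entry `'PARTICIPANT_COLL_ID'` would end up as an empty list, rather than being left unchanged as in the desired output above.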