Here is one way to do it: group the rows by their first element (the id) using a defaultdict.
    import collections

    raw_data = [
        ['975676924', '1345207523', '-1953633084', '-2041119774', '587903155'],
        ['1619201613', '-1384105381', '1433106581', '1445361759', '587903155'],
        ['-1470352544', '-1068707556', '-1002282042', '-563691616', '587903155'],
        ['-1958275692', '-739953679', '69580355', '-481818422', '587903155'],
        ['1619201613', '-739953679', '-1002282042', '-481818422', '587903155'],
    ]

    # Group by the first element of each row; append the rest to that id's list.
    data = collections.defaultdict(list)
    for line in raw_data:
        data[line[0]].extend(line[1:])
Now you have a dictionary keyed by id, where the values from every row sharing that id are merged into a single list:
    defaultdict(<type 'list'>, {
        '1619201613': ['-1384105381', '1433106581', '1445361759', '587903155',
                       '-739953679', '-1002282042', '-481818422', '587903155'],
        '-1470352544': ['-1068707556', '-1002282042', '-563691616', '587903155'],
        '975676924': ['1345207523', '-1953633084', '-2041119774', '587903155'],
        '-1958275692': ['-739953679', '69580355', '-481818422', '587903155']})
To get the desired list of lists, put each key back in front of its merged values:
    data_list = [[key] + value for key, value in data.items()]
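For reference, the two steps can be combined into one small function. This is only a minimal sketch: the name `merge_rows_by_id` and the shortened `sample` data are made up for illustration and are not part of the question. It assumes every row has at least one element and that the first element is the id.

```python
import collections


def merge_rows_by_id(rows):
    """Group rows by their first element and concatenate the remaining values.

    Hypothetical helper: the name and signature are illustrative only.
    """
    grouped = collections.defaultdict(list)
    for row in rows:
        # row[0] is the id; everything after it is appended to that id's list.
        grouped[row[0]].extend(row[1:])
    # Rebuild one flat list per id: [id, value, value, ...]
    return [[key] + values for key, values in grouped.items()]


# Shortened, made-up sample data with one repeated id ('a').
sample = [
    ['a', '1', '2'],
    ['b', '3', '4'],
    ['a', '5', '2'],
]
merged = merge_rows_by_id(sample)
print(merged)
```

Note that repeated values are kept, just as '587903155' appears twice for '1619201613' in the output above. On Python 3.7 and later the result follows the order in which ids first appear; on older versions the order of the dictionary keys is arbitrary.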