I am looking for an efficient way to drop the zeros from a list of dictionaries built from a pd.DataFrame. Take the following example:
import pandas as pd

df = pd.DataFrame([[1, 2], [0, 4]], columns=['a', 'b'], index=['x', 'y'])
df.to_dict('records')
[{'a': 1, 'b': 2}, {'a': 0, 'b': 4}]
I would like to get:
[{'a': 1, 'b': 2}, {'b': 4}]
I have a very large sparse DataFrame, so storing all the zeros is wasteful. Because the data is large, I am looking for something faster than looping over the records from to_dict and dropping the zeros one dictionary at a time. For example, the following works, but it is very slow and uses a lot of memory:
new_records = []
for record in df.to_dict('records'):
    # keep only the entries whose value is nonzero (truthy)
    new_records.append({k: v for k, v in record.items() if v})
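One direction that skips the intermediate to_dict('records') list is to read the nonzero positions straight from the underlying NumPy array. The following is only a sketch under the assumption that every column is numeric; the function name `nonzero_records` is hypothetical (not part of pandas), and it has not been benchmarked on large data, so whether it is actually faster is an open question.

```python
import numpy as np
import pandas as pd


def nonzero_records(df):
    """Build one dict per row, keeping only the nonzero entries.

    Hypothetical helper; assumes all columns are numeric.
    """
    values = df.to_numpy()
    columns = df.columns.to_numpy()
    # Start with one empty dict per row so all-zero rows are still represented.
    records = [{} for _ in range(len(df))]
    # Row and column positions of every nonzero cell, in row-major order.
    rows, cols = np.nonzero(values)
    # .tolist() converts NumPy scalars to plain Python objects.
    for r, c, v in zip(rows.tolist(),
                       columns[cols].tolist(),
                       values[rows, cols].tolist()):
        records[r][c] = v
    return records


df = pd.DataFrame([[1, 2], [0, 4]], columns=['a', 'b'], index=['x', 'y'])
print(nonzero_records(df))
```

The Python-level loop here runs only over the nonzero cells, which is where any saving on a sparse frame would come from.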
Is there a more efficient method or approach to this?