How to remove zeros when converting a pandas DataFrame to records

I am looking for an efficient way to remove zeros from the list of dictionaries created from a pd.DataFrame. Take the following example:

import pandas as pd

df = pd.DataFrame([[1, 2], [0, 4]], columns=['a', 'b'], index=['x', 'y'])
df.to_dict('records')

[{'a': 1, 'b': 2}, {'a': 0, 'b': 4}]

I would like to get:

[{'a': 1, 'b': 2}, {'b': 4}]

I have a very large sparse frame, so storing all the zeros is inefficient. Because the data is large, I am looking for something faster than looping over the dictionary records and dropping the zeros. For example, the following works, but it is very slow and uses a lot of memory:

new_records = []
for record in df.to_dict('records'):
    new_records.append(dict((k, v) for k, v in record.items() if v))

Is there a more efficient method or approach to this?

4 answers

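This looks like an XY problem. If the real goal is to store the sparse data efficiently, consider a SciPy sparse matrix: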

In [8]: from scipy import sparse

In [9]: df
Out[9]:
   a  b
x  1  2
y  0  4

In [10]: column_names = df.columns

In [11]: sm = sparse.csc_matrix(df.values)
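As a rough sketch of how this could continue (the variable names and the switch from the CSC format above to row-oriented CSR are assumptions, not part of the original answer), the sparse matrix stores only the non-zero cells, and the per-row dictionaries can be rebuilt from it when they are needed:

```python
import numpy as np
import pandas as pd
from scipy import sparse

df = pd.DataFrame([[1, 2], [0, 4]], columns=['a', 'b'], index=['x', 'y'])

# Keep the column labels separately; the sparse matrix only stores numbers.
column_names = np.asarray(df.columns)

# CSR is row-oriented, so one row's non-zero entries are a contiguous slice.
sm = sparse.csr_matrix(df.values)

records = []
for start, end in zip(sm.indptr[:-1], sm.indptr[1:]):
    cols = column_names[sm.indices[start:end]]  # labels of the non-zero cells
    vals = sm.data[start:end].tolist()          # plain Python numbers
    records.append(dict(zip(cols, vals)))

print(records)
```

Only the three non-zero cells are stored in sm, so memory use scales with the number of non-zero values rather than with the full shape of the frame.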

Or, as piRSquared suggested, use pandas' own sparse structures:

df.to_sparse(0)  # pandas < 1.0 only; to_sparse was removed in pandas 1.0
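On pandas 1.0 and later, a sparse frame can be built with pd.SparseDtype instead. The following is a minimal sketch, assuming an integer frame with 0 as the fill value:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [0, 4]], columns=['a', 'b'], index=['x', 'y'])

# Store every column as a sparse array whose implicit (unstored) value is 0.
sdf = df.astype(pd.SparseDtype("int64", fill_value=0))

# Fraction of cells that are actually stored, i.e. differ from the fill value.
density = sdf.sparse.density

# Convert back to an ordinary dense frame when needed.
dense_again = sdf.sparse.to_dense()
```

The .sparse accessor is only available on frames whose columns all have a sparse dtype.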

[r[r != 0].to_dict() for _, r in df.iterrows()]

[{'a': 1, 'b': 2}, {'b': 4}]

The same filter with apply, which returns a Series of dictionaries instead of a list:

df.apply(lambda row: row[row != 0].to_dict(), axis=1)
x    {'b': 2, 'a': 1}
y            {'b': 4}
dtype: object
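
Both snippets above build a Series object for every row, which tends to be the expensive part. As a sketch of a lighter variant (not taken from the answers here, and not benchmarked), itertuples can yield plain tuples that are then filtered with an ordinary dict comprehension:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [0, 4]], columns=['a', 'b'], index=['x', 'y'])

columns = list(df.columns)

# index=False, name=None makes itertuples yield plain tuples of cell values.
records = [
    {col: val for col, val in zip(columns, row) if val != 0}
    for row in df.itertuples(index=False, name=None)
]

print(records)
```

Whether this is actually faster on a given frame should be measured on real data.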

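Another option is to keep the loop but work on the underlying NumPy array instead of the pd.DataFrame, and use numpy.flatnonzero() to locate the non-zero entries in each row.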

import numpy as np

new_records = []
columns = np.array(df.columns.values)
for record in df.to_numpy():  # df.as_matrix() on old pandas; removed in pandas 1.0
    idx = np.flatnonzero(record)  # positions of the non-zero values
    new_records.append(dict(zip(columns[idx], record[idx])))

The result is the same as before:

[{'a': 1, 'b': 2}, {'b': 4}]

Explanation:

  • First, extract the column labels to use as keys for each new dictionary, and turn them into a NumPy array so that they can be indexed with an index array: np.array(df.columns.values).
  • Then convert the dataframe to a NumPy array: df.to_numpy() (df.as_matrix() on old pandas versions; it was removed in pandas 1.0).
  • For each record, get the positions of the non-zero values with np.flatnonzero().
  • Build a dictionary from the selected columns and values, using that index: dict(zip(columns[idx], record[idx])).
  • Add new dictionary to new_records
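
The steps above can be wrapped in a small helper. This is a sketch only: the function name nonzero_records is hypothetical, it relies on df.to_numpy() (pandas 0.24 and later), and the extra .tolist() call, which turns NumPy scalars into plain Python numbers, is an addition to the steps listed above.

```python
import numpy as np
import pandas as pd


def nonzero_records(df):
    """Return one dict per row, holding only that row's non-zero cells."""
    columns = np.asarray(df.columns)   # labels as an array, so an index array can select them
    records = []
    for row in df.to_numpy():          # one 1-D array per row
        idx = np.flatnonzero(row)      # positions of the non-zero cells
        records.append(dict(zip(columns[idx], row[idx].tolist())))
    return records


df = pd.DataFrame([[1, 2], [0, 4]], columns=['a', 'b'], index=['x', 'y'])
print(nonzero_records(df))
```

Note that to_numpy() materialises the whole frame as one dense array, so this still needs enough memory for the dense data.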

Source: https://habr.com/ru/post/1663791/

