I arrived at a solution, albeit a slightly imperfect one.
What you can do is manually create a number of pandas SparseSeries from the columns, combine them into a dict, and then pass that dict to the DataFrame constructor (not to the SparseDataFrame constructor). Casting to a SparseDataFrame currently hits an immature constructor that converts the entire object to dense form and then back to sparse form regardless of input. Building the SparseSeries into a regular DataFrame, however, preserves sparsity while still giving you a viable and otherwise complete DataFrame.
Here is a demonstration of how to do this, written more for clarity than for performance. One difference from my own implementation is that I built the dict of sparse vectors with a dict comprehension instead of a loop.
    import pandas
    import numpy

    df = pandas.DataFrame({'user_id':[1,2,1,4],'value':[100,100,200,200]})

    # Get unique users and unique features
    num_rows = len(df['user_id'].unique())
    num_features = len(df['value'].unique())
    unique_users = df['user_id'].unique().copy()
    unique_features = df['value'].unique().copy()
    unique_users.sort()
    unique_features.sort()

    # assign each user_id to a row_number
    user_lookup = pandas.DataFrame({'uid':range(num_rows), 'user_id':unique_users})

    vec_dict = {}

    # Create a sparse vector for each feature
    for i in range(num_features):
        users_with_feature = df[df['value']==unique_features[i]]['user_id']
        uid_rows = user_lookup[user_lookup['user_id'].isin(users_with_feature)]['uid']
        vec = numpy.zeros(num_rows)
        vec[uid_rows] = 1
        sparse_vec = pandas.Series(vec).to_sparse(fill_value=0)
        vec_dict[unique_features[i]] = sparse_vec

    my_pandas_frame = pandas.DataFrame(vec_dict)
    my_pandas_frame = my_pandas_frame.set_index(user_lookup['user_id'])
Results:
    >>> my_pandas_frame
             100  200
    user_id
    1          1    1
    2          1    0
    4          0    1
    >>> type(my_pandas_frame)
    <class 'pandas.core.frame.DataFrame'>
    >>> type(my_pandas_frame[100])
    <class 'pandas.sparse.series.SparseSeries'>
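For comparison, the dict-comprehension variant might look roughly like the sketch below. This is not the original implementation. It assumes pandas 1.0 or later, where `Series.to_sparse` and `SparseSeries` were removed, so it builds each column with `pandas.arrays.SparseArray` instead. The helper name `indicator` and the variable `sparse_frame` are made up for illustration.

```python
import numpy
import pandas

df = pandas.DataFrame({'user_id': [1, 2, 1, 4], 'value': [100, 100, 200, 200]})

# Sorted unique users (rows) and unique features (columns)
unique_users = numpy.sort(df['user_id'].unique())
unique_features = numpy.sort(df['value'].unique())


def indicator(feature):
    """Dense 0/1 vector marking which users have the given feature (hypothetical helper)."""
    users_with_feature = df.loc[df['value'] == feature, 'user_id']
    return numpy.isin(unique_users, users_with_feature).astype(float)


# One sparse column per feature, built with a dict comprehension instead of a loop
vec_dict = {
    feature: pandas.arrays.SparseArray(indicator(feature), fill_value=0)
    for feature in unique_features
}

# A regular DataFrame whose columns carry a sparse dtype
sparse_frame = pandas.DataFrame(
    vec_dict, index=pandas.Index(unique_users, name='user_id')
)
```

In this sketch `type(sparse_frame)` is still a plain `DataFrame`, and the sparsity lives in the column dtypes rather than in a `SparseSeries` class.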
The resulting my_pandas_frame is a complete DataFrame, yet its columns are still sparse. There are a few caveats: if you make a simple copy, or take a subset that is not in place, the frame can forget its sparsity and try to turn itself back into a dense one. For my purposes, though, I am happy with it.
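One way to guard against a silent fallback to dense storage is to check the column dtypes after each operation and convert back if needed. A minimal sketch, again assuming pandas 1.0 or later (where sparsity is expressed through `pandas.SparseDtype`); `all_sparse` is a hypothetical helper name, and the small frame is rebuilt here only so the example is self-contained:

```python
import pandas


def all_sparse(frame):
    """Return True if every column of the frame has a sparse dtype (hypothetical helper)."""
    return all(isinstance(dtype, pandas.SparseDtype) for dtype in frame.dtypes)


# Same 0/1 data as in the example, first dense, then converted to sparse columns
dense = pandas.DataFrame(
    {100: [1.0, 1.0, 0.0], 200: [1.0, 0.0, 1.0]},
    index=pandas.Index([1, 2, 4], name='user_id'),
)
sparse = dense.astype(pandas.SparseDtype(float, 0))

# Take a subset, then re-sparsify only if the operation lost the sparse dtype
subset = sparse.loc[[1, 4]]
if not all_sparse(subset):
    subset = subset.astype(pandas.SparseDtype(float, 0))
```

Whether a given copy or subset operation keeps the sparse dtype depends on the pandas version, which is why the sketch checks rather than assumes.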