Pandas - Get dummies for specific values only

I have a Pandas series of 10,000 rows, each holding a single letter from A to Z. However, I only want to create dummy columns for A, B and C using Pandas get_dummies. How can I do this?

I don’t want to get dummies for all the values in the column and then select specific columns, since the column contains other redundant data that ultimately leads to a memory error.

1 answer

Try the following:

    import pandas as pd

    # create a mock dataframe
    df = pd.DataFrame({'alpha': ['a', 'a', 'b', 'b', 'c', 'e', 'f', 'g']})

    # use replace with a regex to set the characters d-z to None,
    # so that get_dummies creates no columns for them
    pd.get_dummies(df.replace({'[^a-c]': None}, regex=True))

Output:

       alpha_a  alpha_b  alpha_c
    0        1        0        0
    1        1        0        0
    2        0        1        0
    3        0        1        0
    4        0        0        1
    5        0        0        0
    6        0        0        0
    7        0        0        0
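
If the regex feels fragile, another option is to restrict the allowed categories before calling get_dummies. The following is only a minimal sketch, not part of the original answer: the sample series `s` and the list `wanted` are made-up stand-ins for the real 10,000-row data. Converting to a categorical dtype with only the wanted categories turns every other value into NaN, so get_dummies should build columns for A, B and C only.

```python
import pandas as pd

# Hypothetical sample data standing in for the real series
s = pd.Series(list("ABCXYZAB"), name="alpha")

# Only these values should get a dummy column (assumed from the question)
wanted = ["A", "B", "C"]

# Values outside the listed categories become NaN
restricted = s.astype(pd.CategoricalDtype(categories=wanted))

# NaN rows get all zeros; dtype=int gives 0/1 instead of booleans
dummies = pd.get_dummies(restricted, prefix="alpha", dtype=int)
print(dummies)
```

Passing dtype=int keeps the 0/1 style of the output above; recent pandas versions return booleans by default. Because the dummy columns for the unwanted letters are never created, this may also help with the memory error mentioned in the question.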

Source: https://habr.com/ru/post/1235138/

