The main goals are as follows:
1) Apply StandardScaler to continuous variables
2) Apply LabelEncoder and OneHotEncoder to categorical variables
Continuous variables need to be scaled, but a couple of the categorical variables are also of integer type. Applying StandardScaler to every numeric column would therefore have an undesired effect: it would also scale the integer-coded categorical variables, which is not what I want.
Since continuous and categorical variables are mixed in one pandas DataFrame, what is the recommended workflow to solve this problem?
The best example to illustrate my point is the Kaggle Bike Sharing Demand dataset, where season and weather are integer categorical variables.
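
To make the setup concrete, here is a minimal sketch of one possible way to route each group of columns to its own transformer. It assumes scikit-learn 0.20 or later, where ColumnTransformer is available and OneHotEncoder accepts integer categories directly (so this sketch skips LabelEncoder). The toy values are made up and only mimic the Bike Sharing columns; temp and humidity are assumed to be the continuous columns, and the two column lists are written out by hand rather than inferred from dtypes.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy stand-in for the Bike Sharing data; the values are made up.
df = pd.DataFrame({
    "season": [1, 2, 3, 4, 1, 2],      # integer-coded categorical
    "weather": [1, 1, 2, 3, 2, 1],     # integer-coded categorical
    "temp": [9.8, 14.2, 25.4, 18.0, 8.1, 15.6],
    "humidity": [81, 70, 55, 62, 85, 66],
})

# Assumption: the column groups are listed explicitly, because selecting
# by dtype would put season and weather in the numeric group.
continuous_cols = ["temp", "humidity"]
categorical_cols = ["season", "weather"]

preprocessor = ColumnTransformer(
    transformers=[
        ("scale", StandardScaler(), continuous_cols),
        ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Output columns follow the transformer order: scaled columns first,
# then the one-hot indicator columns.
X = preprocessor.fit_transform(df)
print(X.shape)
```

Listing the columns by name keeps the integer-coded categoricals out of the scaler regardless of their dtype. Whether this is the recommended workflow is exactly what I am asking.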