I have a large pandas DataFrame with approximately 80 columns. Each column reports daily traffic statistics for one website (one column per website).
Since I don't want to work with raw traffic statistics, I would like to normalize all my columns (except the first, which is the date), either to the range 0 to 1 or (even better) 0 to 100.
Date          A        B        ...
10/10/2010    100.0    402.0    ...
11/10/2010    250.0    800.0    ...
12/10/2010    800.0    2000.0   ...
13/10/2010    400.0    1800.0   ...
That said, I wonder which normalization I should apply: Min-Max scaling or z-score normalization (standardization)? Some of my columns have strong outliers. It would be great to have an example. I regret that I cannot provide the complete data.
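To make the two options concrete, here is a minimal sketch of what I mean, using only the four sample rows above. The column names `A` and `B` stand in for the real website columns, and the variable names (`value_cols`, `minmax`, `zscore`) are just illustrative choices.

```python
import pandas as pd

# Sample rows from the question; "A" and "B" stand in for website columns.
df = pd.DataFrame({
    "Date": ["10/10/2010", "11/10/2010", "12/10/2010", "13/10/2010"],
    "A": [100.0, 250.0, 800.0, 400.0],
    "B": [402.0, 800.0, 2000.0, 1800.0],
})

# Every column except the first (the date) gets normalized.
value_cols = df.columns[1:]

# Min-max scaling: map each column's minimum to 0 and its maximum to 100.
col_min = df[value_cols].min()
col_max = df[value_cols].max()
minmax = df.copy()
minmax[value_cols] = (df[value_cols] - col_min) / (col_max - col_min) * 100

# z-score standardization: subtract the column mean, divide by the
# column standard deviation (population form, ddof=0).
col_mean = df[value_cols].mean()
col_std = df[value_cols].std(ddof=0)
zscore = df.copy()
zscore[value_cols] = (df[value_cols] - col_mean) / col_std

print(minmax)
print(zscore)
```

As far as I understand, with min-max scaling a single extreme value fixes the top of the range and compresses all other days toward 0, while z-scores are not bounded to a fixed range at all. Also, a column that is constant would produce a division by zero (NaN) in both variants.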