Inconsistent labeling in sklearn LabelEncoder?

I applied LabelEncoder() to a data frame, and it returns the following:

(screenshot of the encoded data frame; image not available)

The same value, order/new_cart, is encoded with different numbers in different columns, such as 70, 64, 71, etc.

Is this an inconsistent labeling, or am I doing something wrong?

2 answers

LabelEncoder works with one-dimensional arrays. If you apply it to multiple columns, the labels will be consistent within each column, but not across columns.
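To make the effect concrete, here is a minimal sketch. The two-column frame and its column names are made up for illustration; the point is only that a fresh fit per column assigns codes from that column's own sorted set of values.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical frame: the value "b" appears in both columns.
df = pd.DataFrame({"x": ["a", "b", "c"], "y": ["b", "c", "d"]})

# A new encoder is fit for every column, so each column gets its own
# mapping. Wrapping the result in a Series keeps the original row index.
per_column = df.apply(
    lambda col: pd.Series(LabelEncoder().fit_transform(col), index=col.index)
)

code_b_in_x = per_column.loc[1, "x"]  # classes of x are a, b, c
code_b_in_y = per_column.loc[0, "y"]  # classes of y are b, c, d
print(code_b_in_x, code_b_in_y)       # the two codes for "b" differ
```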

As a workaround, you can flatten the dataframe into a one-dimensional array and call LabelEncoder on that array.

Suppose this is a dataframe:

df
Out[372]: 
   0  1  2
0  d  d  a
1  c  a  c
2  c  c  b
3  e  e  d
4  d  d  e
5  d  b  e
6  e  e  b
7  a  e  b
8  b  c  c
9  e  a  b

With ravel and then reshaping back to the original shape:

pd.DataFrame(LabelEncoder().fit_transform(df.values.ravel()).reshape(df.shape), columns = df.columns)
Out[373]: 
   0  1  2
0  3  3  0
1  2  0  2
2  2  2  1
3  4  4  3
4  3  3  4
5  3  1  4
6  4  4  1
7  0  4  1
8  1  2  2
9  4  0  1

Edit:

If you want to see which label each number stands for, keep a reference to the LabelEncoder.

le = LabelEncoder()
df2 = pd.DataFrame(le.fit_transform(df.values.ravel()).reshape(df.shape), columns = df.columns)

le.classes_ then gives you the classes in the order of their codes (starting from 0).

le.classes_
Out[390]: array(['a', 'b', 'c', 'd', 'e'], dtype=object)

If you want to look up the integer for a given label, you can build a dict:

dict(zip(le.classes_, np.arange(len(le.classes_))))
Out[388]: {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}

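You can do the same lookup with the transform method, without building a dict (recent scikit-learn versions expect a 1-D array-like, so pass the label in a list):

le.transform(['c'])
Out[395]: array([2])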
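For a self-contained version of these lookups, here is a minimal sketch. The label list is hypothetical. inverse_transform is not used in the answer above; it is the scikit-learn method for going from codes back to labels and is included only to show the round trip.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(["d", "c", "e", "a", "b"])  # hypothetical labels, in arbitrary order

# classes_ is sorted, so a label's position in classes_ is its code.
label_to_code = {str(label): code for code, label in enumerate(le.classes_)}

# transform and inverse_transform take array-likes, not bare scalars.
code_c = int(le.transform(["c"])[0])
decoded = le.inverse_transform(np.array([3, 0, 4])).tolist()

print(label_to_code)
print(code_c, decoded)
```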

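LabelEncoder can be used on a whole DataFrame.

The problem is that you apply fit_transform, which refits the encoder for every column. That is, in this line: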

labeled_df = String_df.apply(LabelEncoder().fit_transform)
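  • a single LabelEncoder is created
  • apply then calls its fit_transform on each column. For every column of the DataFrame, fit_transform does two things:
    A. fits the encoder (learns the labels present in that column) B. transforms the column using those labels.

So, for each new column, fit_transform refits the LabelEncoder, and the mapping learned from the previous column is discarded.

To get the same labels everywhere, the LabelEncoder has to be fit only once, on all the values.

Do not use fit_transform. Instead, fit first and then transform: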

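from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
all_values = String_df.values.ravel()  # flatten the dataframe into one long 1-D array
encoder.fit(all_values)  # learn a single mapping from every value in the frame
labeled_df = String_df.apply(encoder.transform)  # encode each column with that shared mapping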
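As a quick check that fitting once gives the same result as the ravel and reshape approach from the first answer, here is a minimal sketch. The small frame is hypothetical and stands in for String_df; wrapping each transformed column in a Series is a choice made here to keep the original row index.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for String_df: "b" and "c" occur in both columns.
df = pd.DataFrame({"x": ["a", "b", "c"], "y": ["b", "c", "d"]})

encoder = LabelEncoder()
encoder.fit(df.values.ravel())  # one fit over every value in the frame

# Transform column by column with the already-fitted encoder.
labeled = df.apply(
    lambda col: pd.Series(encoder.transform(col), index=col.index)
)

# Same encoding obtained by flattening and reshaping.
reshaped = encoder.transform(df.values.ravel()).reshape(df.shape)
same = bool((labeled.to_numpy() == reshaped).all())
print(labeled)
print(same)
```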

Source: https://habr.com/ru/post/1648229/

