Remove last two characters from column names of all columns in Dataframe - Pandas

I am joining two DataFrames (a, b) that have identical column names, using userId as the key. To make the join work I had to specify a suffix. This is the command I used:

a.join(b,how='inner', on='userId',lsuffix="_1")

Without the suffix I get an error. But I do not want the column names to change, because that causes problems in a later analysis. So I want to remove the "_1" from all the column names of the resulting frame. Can someone suggest an efficient way to remove the last two characters from the names of all columns in a pandas DataFrame?

Thanks.

+4

You can reassign the columns with the last two characters sliced off each name:

df.columns = pd.Index(map(lambda x : str(x)[:-2], df.columns))

Or use rename, which returns a new DataFrame, so assign the result back:

df = df.rename(columns = lambda x : str(x)[:-2])

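In both cases a function is applied to every column name: it converts the name to a string and keeps everything except the last two characters.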
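Slicing off the last two characters touches every column, including any that never received the suffix. A minimal sketch, assuming a hypothetical frame with mixed names (the column names and the `drop_suffix` helper are illustrative, not from the question), that strips `_1` only where it is actually present:

```python
import pandas as pd

# Hypothetical frame: only some columns carry the "_1" suffix.
df = pd.DataFrame({"userId_1": [1, 2], "score_1": [0.5, 0.7], "age": [30, 41]})

def drop_suffix(name, suffix="_1"):
    """Return the name without the trailing suffix; leave other names untouched."""
    name = str(name)
    return name[: -len(suffix)] if name.endswith(suffix) else name

# rename returns a new DataFrame, so assign the result back.
df = df.rename(columns=drop_suffix)
print(list(df.columns))
```

Here `age` keeps its full name, whereas the unconditional `[:-2]` slice would have shortened it to `a`.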

+9

You can use str.rstrip, which removes any trailing "_" and "1" characters from each name:

In [213]: import numpy as np, pandas as pd

In [214]: import functools as ft

In [215]: f = ft.partial(np.random.choice, *[5, 3])

In [225]: df = pd.DataFrame({'a': f(), 'b': f(), 'c': f(), 'a_1': f(), 'b_1': f(), 'c_1': f()})

In [226]: df
Out[226]:
   a  b  c  a_1  b_1  c_1
0  4  2  0    2    3    2
1  0  0  3    2    1    1
2  4  0  4    4    4    3

In [227]: df.columns = df.columns.str.rstrip('_1')

In [228]: df
Out[228]:
   a  b  c  a  b  c
0  4  2  0  2  3  2
1  0  0  3  2  1  1
2  4  0  4  4  4  3
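One caveat: `rstrip('_1')` treats its argument as a set of characters, not as a literal suffix, so it can remove more than intended when a name itself ends in `1` or `_`. A small sketch with made-up column names, comparing it against an anchored regex replacement (newer pandas versions also offer `str.removesuffix`, if available in your install):

```python
import pandas as pd

# Hypothetical column names chosen to expose the difference.
cols = pd.Index(["a_1", "total1_1", "x_11", "b"])

# rstrip removes every trailing character that is "_" or "1".
stripped = cols.str.rstrip("_1")

# An anchored regex removes only a literal "_1" at the very end of the name.
suffix_only = cols.str.replace(r"_1$", "", regex=True)

print(list(stripped))     # ['a', 'total', 'x', 'b']
print(list(suffix_only))  # ['a', 'total1', 'x_11', 'b']
```

For names like `a_1`, `b_1`, `c_1` the two approaches agree, which is why the session above gives the expected result.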

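For the more general case with several different suffixes (_0, _1, _2, and so on), you can use str.extract with a regular expression: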

In [216]: df = pd.DataFrame({f'{c}_{i}': f() for i in range(3) for c in 'abc'})

In [217]: df
Out[217]:
   a_0  b_0  c_0  a_1  b_1  c_1  a_2  b_2  c_2
0    0    1    0    2    2    4    0    0    3
1    0    0    3    1    4    2    4    3    2
2    2    0    1    0    0    2    2    2    1

In [223]: df.columns = df.columns.str.extract(r'(.*)_\d+')[0]

In [224]: df
Out[224]:
0  a  b  c  a  b  c  a  b  c
0  1  1  0  0  0  2  1  1  2
1  1  0  1  0  1  2  0  4  1
2  1  3  1  3  4  2  0  1  1
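Note that `str.extract` yields NaN for any column name that does not match the pattern, and the `[0]` selection leaves the columns index named `0` (the stray `0` in the header above). A hedged alternative sketch, using a hypothetical frame that includes an unsuffixed `userId` column, removes a trailing `_<digits>` with `str.replace` and leaves other names alone:

```python
import pandas as pd

# Hypothetical frame: three suffixed columns and one without a suffix.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a_0", "a_1", "b_12", "userId"])

# With extract, the non-matching name "userId" becomes NaN.
extracted = df.columns.str.extract(r"(.*)_\d+")[0]
print(extracted.isna().tolist())  # [False, False, False, True]

# Replacing an anchored "_<digits>" suffix keeps non-matching names intact.
df.columns = df.columns.str.replace(r"_\d+$", "", regex=True)
print(list(df.columns))  # ['a', 'a', 'b', 'userId']
```

Either way the result contains duplicate column names, so selecting `df['a']` returns a DataFrame with both columns rather than a single Series.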

See the df.columns.str accessor for more string methods.

Source: https://habr.com/ru/post/1693133/
