Pandas Multiple Column Names

I am creating a DataFrame from a CSV file. I have just started with pandas and went through the docs, a few SO posts, and other links, but didn't get it. The CSV has several columns with the same name, say a.

So, after building the DataFrame, what will df['a'] return? It does not return all the values.

Also, only one of those columns will contain a string; the rest will be None. How can I get that column?

1 answer

The relevant parameter is mangle_dupe_cols.

From the docs:

 mangle_dupe_cols : boolean, default True
     Duplicate columns will be specified as 'X.0'...'X.N', rather than 'X'...'X'

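By default, the duplicated 'a' columns are renamed instead of all keeping the name 'a'. In practice pandas leaves the first one as 'a' and names the rest 'a.1', 'a.2', and so on (not 'a.0'...'a.N' as the docstring suggests), so df['a'] returns only the first of them.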
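As a minimal sketch of that default behaviour, the snippet below reads a small CSV with three columns named a and checks which labels come out and what df['a'] holds. The sample data and the variable names (csv_text, column_names, first_a) are made up for illustration, and the exact labels may depend on your pandas version.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: three columns share the header "a".
csv_text = "a,a,a,b\n1,2,3,4\n5,6,7,8\n"

df = pd.read_csv(StringIO(csv_text))

# Duplicate headers are de-duplicated while the file is read.
column_names = list(df.columns)
print(column_names)

# df["a"] is a single Series holding only the first "a" column.
first_a = df["a"]
print(first_a.tolist())
```

So a plain df['a'] never sees the second and third columns; they have to be addressed under their renamed labels.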

If you pass mangle_dupe_cols=False, importing this CSV raises an error.

You can get all of these columns with:

 df.filter(like='a') 
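One caveat worth sketching: like='a' matches any label that contains the letter a, so unrelated columns can slip in. Assuming the renamed columns follow the 'a', 'a.1', 'a.2' pattern, an anchored regex is a stricter alternative. The frame and the "data" column below are hypothetical.

```python
import pandas as pd

# Hypothetical frame: "data" is unrelated, but its label contains the letter "a".
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=["a", "a.1", "a.2", "b", "data"])

loose = df.filter(like="a")               # substring match on the label
strict = df.filter(regex=r"^a(\.\d+)?$")  # "a" or "a.<number>" only

print(list(loose.columns))
print(list(strict.columns))
```

If none of the other labels contain an a, the simpler like='a' form is enough.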

Demonstration:

 from io import StringIO  # Python 3 location of StringIO
 import pandas as pd

 txt = """a, a, a, b, c, d
 1, 2, 3, 4, 5, 6
 7, 8, 9, 10, 11, 12"""

 # skipinitialspace=True drops the blank after each comma
 df = pd.read_csv(StringIO(txt), skipinitialspace=True)
 df


 df.filter(like='a') 
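For the second part of the question (only one of the duplicated columns holds a string in each row), one possible approach is to back-fill across the selected columns and keep the first one. This is only a sketch: it assumes the empty cells are read as missing values (NaN) and that the columns follow the 'a', 'a.1', 'a.2' naming; the sample data and the names dupes and merged are made up.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: in every row exactly one of the "a" columns is filled.
csv_text = "a,a,a,b\nfoo,,,1\n,bar,,2\n,,baz,3\n"
df = pd.read_csv(StringIO(csv_text))

# Select the duplicated columns ("a", "a.1", "a.2").
dupes = df.filter(regex=r"^a(\.\d+)?$")

# Back-fill along each row so the first column ends up holding
# the first non-missing value of that row.
merged = dupes.bfill(axis=1).iloc[:, 0]
print(merged.tolist())
```

Rows where every "a" column is empty stay missing in merged.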


Source: https://habr.com/ru/post/1258052/

