Pandas Multiple Column Names

Question

I am creating a DataFrame from a CSV file. I just started with pandas; I went through the docs and a few SO posts and links, but didn't get it. The CSV has several columns with the same name, say a.

So, after building the DataFrame, what does df['a'] return? It does not return all of the values.

Also, only one of the values will contain a string; the rest will be None. How can I get that column?

+5

python python-2.7 pandas csv

vks Oct 11 '16 at 21:19

1 answer

piRSquared · Accepted Answer · 2016-10-11T21:22:15+0000

The relevant parameter is mangle_dupe_cols.

From the docs:

    mangle_dupe_cols : boolean, default True
        Duplicate columns will be specified as 'X.0'...'X.N', rather than 'X'...'X'

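By default, the duplicated 'a' columns are renamed so that every name is unique. Although the docs quoted above say 'X.0'...'X.N', in practice pandas leaves the first occurrence as 'a' and names the later ones 'a.1', 'a.2', and so on. That is why df['a'] returns only the first of those columns.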

If you pass mangle_dupe_cols=False, importing this CSV will result in an error.

You can get all of these columns with filter, which here keeps every column whose name contains 'a':
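To check which names the parser actually assigns, here is a minimal sketch. The CSV text is made up for illustration, and the output shown is what I would expect from the pandas versions I am familiar with (first duplicate keeps its name, later ones get a numeric suffix).

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV whose header repeats the name "a" three times.
txt = "a,a,a,b\n1,2,3,4\n5,6,7,8\n"

df = pd.read_csv(StringIO(txt))

# Duplicate names are made unique by appending ".1", ".2", ...
columns = list(df.columns)
print(columns)  # ['a', 'a.1', 'a.2', 'b']

# df['a'] therefore selects only the first of the duplicated columns.
first_a = df["a"].tolist()
print(first_a)  # [1, 5]
```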

 df.filter(like='a')

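Demonstration

    from io import StringIO  # on Python 2.7: from StringIO import StringIO
    import pandas as pd

    # Sample CSV text with the column name 'a' repeated three times
    txt = """a, a, a, b, c, d
    1, 2, 3, 4, 5, 6
    7, 8, 9, 10, 11, 12"""

    # skipinitialspace=True drops the blank after each comma
    df = pd.read_csv(StringIO(txt), skipinitialspace=True)
    df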

 df.filter(like='a')
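For the second part of the question (only one of the duplicated columns holds a string, the rest are empty), one possible approach is sketched below. It rests on a few assumptions: the sample data is invented, the empty cells are read as NaN rather than the literal None, and the anchored regex is my substitute for like='a', so that unrelated names that merely contain an "a" are not picked up.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: three columns named "a", at most one filled per row.
txt = "a,a,a,b\nx,,,1\n,,y,2\n,z,,3\n"

df = pd.read_csv(StringIO(txt))

# Anchored regex: matches "a", "a.1", "a.2", but not a name like "alpha".
dupes = df.filter(regex=r"^a(\.\d+)?$")

# Back-fill across the columns, then take the first column: this yields
# the first non-null value found in each row.
merged = dupes.bfill(axis=1).iloc[:, 0]
print(merged.tolist())  # ['x', 'y', 'z']
```

If a row could have more than one filled cell, this keeps the leftmost one; whether that is the right choice depends on the data.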
