Python Pandas add Column CSV file name

Question


My Python code in the example below works correctly: it combines a directory of CSV files and matches the headers. However, I want to go one step further: how can I add a column containing the name of the CSV file each row came from?

    import pandas as pd
    import glob

    globbed_files = glob.glob("*.csv")  # creates a list of all csv files

    data = []  # pd.concat takes a list of dataframes as an argument
    for csv in globbed_files:
        frame = pd.read_csv(csv)
        data.append(frame)

    bigframe = pd.concat(data, ignore_index=True)  # don't want pandas to try to align row indexes
    bigframe.to_csv("Pandas_output2.csv")


python pandas dataframe glob

specmer Jan 25 '17 at 17:16


2 answers

Mike's answer works fine. In case anyone googling this runs into the following error:

 >>> TypeError: cannot concatenate object of type "<type 'str'>"; only pd.Series, pd.DataFrame, and pd.Panel (deprecated) objs are valid

This can happen when the delimiter is wrong. My CSV files used a custom delimiter, ^, so I had to specify it in the pd.read_csv call.

    import os

    for csv in globbed_files:
        frame = pd.read_csv(csv, sep='^')
        frame['filename'] = os.path.basename(csv)
        data.append(frame)


Daniel Butler Apr 29 '19 at 17:14


Mike müller · Accepted Answer · 2017-01-25T17:19:49+0000

This should work:

    import os

    for csv in globbed_files:
        frame = pd.read_csv(csv)
        frame['filename'] = os.path.basename(csv)
        data.append(frame)

frame['filename'] creates a new column named filename, and os.path.basename() turns a path such as /a/d/c.txt into the bare file name c.txt.
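For anyone who wants the pieces in one place, here is a minimal sketch that folds the loop above into a function. The name combine_csvs, its pattern and sep parameters, the sorted() call, and the empty-match check are my own additions for illustration and are not part of the original question or answer.

```python
import glob
import os

import pandas as pd


def combine_csvs(pattern="*.csv", sep=","):
    """Read every CSV matching `pattern` and tag each row with its source file.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    frames = []
    # sorted() only makes the row order predictable; glob's order is arbitrary
    for path in sorted(glob.glob(pattern)):
        frame = pd.read_csv(path, sep=sep)
        # Same assignment as in the answer: one value broadcast to every row
        frame["filename"] = os.path.basename(path)
        frames.append(frame)
    if not frames:
        # pd.concat raises on an empty list, so fail with a clearer message
        raise FileNotFoundError(f"no files match {pattern!r}")
    return pd.concat(frames, ignore_index=True)
```

It could then be called as bigframe = combine_csvs("*.csv") followed by bigframe.to_csv("Pandas_output2.csv", index=False). Passing index=False keeps the row index out of the output file. Note that if the output is written into the same directory, a later run of "*.csv" will pick it up as an input as well.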
