How to read multiple files and combine them into one pandas data frame?

I want to read several files located in one directory and then combine them into one pandas data frame.

It works if I do it as follows:

import pandas as pd

df1 = pd.read_csv("data/12015.csv")
df2 = pd.read_csv("data/22015.csv")
df3 = pd.read_csv("data/32015.csv")

df = pd.concat([df1, df2, df3])

However, I want to use a more elegant solution, which would be especially useful if the number of files is more than 3.

I tried the approach below, but I do not know how to apply concat inside the for loop.

import pandas as pd
import os
from os import path

files = [x for x in os.listdir("data") if path.isfile("data"+os.sep+x)]

for f in files:
    df = pd.read_csv("data/"+f)
2 answers

You can use a list comprehension to build a list of DataFrames and then call pd.concat() on that list. Example -

import pandas as pd
import os
from os import path
dfs = [pd.read_csv(path.join('data',x)) for x in os.listdir("data") if path.isfile(path.join("data",x))]
df = pd.concat(dfs)

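It is also better to build the paths with os.path.join() than to concatenate strings by hand, since it uses the correct separator for the platform.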
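
As a variant of the same idea, here is a minimal sketch that picks files by pattern with glob instead of os.listdir. The helper name read_csv_dir and the "*.csv" default are assumptions made for illustration, not something from the question; ignore_index=True is optional and only renumbers the rows of the combined frame.

```python
import glob
import os

import pandas as pd


def read_csv_dir(directory, pattern="*.csv"):
    """Hypothetical helper: read every file in `directory` matching `pattern`
    and return the rows of all files as a single DataFrame."""
    # sorted() gives a reproducible row order; glob itself returns
    # matches in arbitrary order.
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    if not paths:
        # pd.concat raises ValueError on an empty list, so fail earlier
        # with a message that names the directory.
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory!r}")
    frames = [pd.read_csv(p) for p in paths]
    # ignore_index=True renumbers the rows 0..n-1 instead of repeating
    # each file's own 0-based index.
    return pd.concat(frames, ignore_index=True)
```

With the layout from the question this would be called as df = read_csv_dir("data").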

A one-line version, using the files list from the question:

dfs = pd.concat([pd.read_csv("data/" + f) for f in files])

Or, if some files may fail to parse, collect the bad ones and concatenate the rest:

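df_list = []
bad_files = []
for f in files:
    try:
        df_list.append(pd.read_csv("data/" + f))
    except Exception:
        # A bare "except:" would also swallow KeyboardInterrupt and SystemExit.
        bad_files.append(f)
# Note: pd.concat raises ValueError if df_list is empty.
dfs = pd.concat(df_list)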
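
If it helps to be stricter about which failures count as a "bad file", the loop above could be narrowed to the errors read_csv typically raises for unreadable input. This is only a sketch: the function name read_good_files is hypothetical, and the choice of exceptions (pd.errors.EmptyDataError, pd.errors.ParserError, UnicodeDecodeError) is an assumption about what should be skipped rather than re-raised.

```python
import os

import pandas as pd


def read_good_files(directory, files):
    """Hypothetical helper: return (combined DataFrame, names of files
    that could not be parsed as CSV)."""
    frames, bad_files = [], []
    for name in files:
        try:
            frames.append(pd.read_csv(os.path.join(directory, name)))
        except (pd.errors.EmptyDataError,
                pd.errors.ParserError,
                UnicodeDecodeError):
            # Only parsing and decoding problems are recorded here; anything
            # else (for example a missing file) still propagates to the caller.
            bad_files.append(name)
    # pd.concat raises ValueError on an empty list, so return an empty
    # frame when every file was rejected.
    combined = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    return combined, bad_files
```

Usage would look like dfs, bad_files = read_good_files("data", files), after which bad_files can be printed or logged.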

Source: https://habr.com/ru/post/1609834/

