Transpose all CSV files in a folder

I got help the last time I asked a question on this site about batch-processing CSV files in a folder using glob.glob() in Python. This time I am trying to use it to transpose all CSV files in a folder. The script below only processes the last file and stops. What am I doing wrong?

    import csv
    import os
    import glob

    directory = raw_input("INPUT Folder")
    output = raw_input("OUTPUT Folder:")
    in_files = os.path.join(directory, '*.csv')

    for in_file in glob.glob(in_files):
        with open(in_file) as input_file:
            reader = csv.reader(input_file)
            cols = []
            for row in reader:
                cols.append(row)
                filename = os.path.splitext(os.path.basename(in_file))[0] + '.csv'

    with open(os.path.join(output, filename), 'wb') as output_file:
        writer = csv.writer(output_file)
        for i in range(len(max(cols, key=len))):
            writer.writerow([(c[i] if i<len(c) else '') for c in cols])
4 answers

You need to indent the "output" part of the code so that it runs once for each iteration of the for in_file loop:

    import csv
    import os
    import glob

    directory = raw_input("INPUT Folder")
    output = raw_input("OUTPUT Folder:")
    in_files = os.path.join(directory, '*.csv')

    for in_file in glob.glob(in_files):
        with open(in_file) as input_file:
            reader = csv.reader(input_file)
            cols = []
            for row in reader:
                cols.append(row)

        # "outdent" this code so it only needs to run once for each in_file
        filename = os.path.splitext(os.path.basename(in_file))[0] + '.csv'

        # Indent this to the same level as the rest of the "for in_file" loop!
        with open(os.path.join(output, filename), 'wb') as output_file:
            writer = csv.writer(output_file)
            for i in range(len(max(cols, key=len))):
                writer.writerow([(c[i] if i<len(c) else '') for c in cols])

In your version, that code runs only once, after the for in_file loop has finished, so it only writes out the cols data left over from the last iteration of the loop.
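The effect can be reproduced without touching any files. This is only a minimal sketch: the file names and the two `written_...` lists are made up for illustration, and `cols` simply stands in for the rows read from each file.

```python
# Hypothetical file names, standing in for the glob.glob() results.
in_files = ['a.csv', 'b.csv', 'c.csv']

# Output step placed AFTER the loop: it runs once, with the last values.
written_after = []
for in_file in in_files:
    cols = [in_file]  # stands in for the rows read from in_file
written_after.append(cols[0])

# Output step indented INTO the loop: it runs once per file.
written_inside = []
for in_file in in_files:
    cols = [in_file]
    written_inside.append(cols[0])

print(written_after)   # ['c.csv']
print(written_inside)  # ['a.csv', 'b.csv', 'c.csv']
```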

I also "outdented" the filename = ... statement to the for in_file level, since it needs to run only once for each in_file, not once for each row of each in_file.


You can get a lot of mileage out of pandas when manipulating data:

    import os
    import pandas as pd

    for filename in os.listdir('.'):
        # We save an augmented filename later,
        # so using splitext is useful for more
        # than just checking the extension.
        prefix, ext = os.path.splitext(filename)
        if ext.lower() != '.csv':
            continue

        # Load the data into a dataframe, treating every row as data
        df = pd.read_csv(filename, header=None)

        # Transpose is easy, but you could do TONS
        # of data processing here. pandas is awesome.
        df_transposed = df.T

        # Save to a new file with an augmented name.
        # header=False keeps the integer column labels out of the output.
        df_transposed.to_csv(prefix+'_T'+ext, header=False, index=False)

The os.walk version is not much different if you need to descend into subfolders as well.
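As a rough sketch of that variant, only the traversal changes. The helper name `iter_csv_paths` is made up for illustration and is not part of the original answer; it uses the standard os.walk() to visit the folder and every subfolder.

```python
import os

def iter_csv_paths(root):
    """Yield the path of every .csv file under root, including subfolders."""
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            # Same case-insensitive extension check as in the flat version.
            if os.path.splitext(filename)[1].lower() == '.csv':
                yield os.path.join(dirpath, filename)
```

Each yielded path can then go through the same load, transpose, and save steps as in the os.listdir loop.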


Here is a working one:

I had to google for an hour, but it works and was tested on Python 3.3.

    import csv
    import os
    import glob

    # Raw strings keep the backslashes in Windows paths literal
    directory = r'C:\Python33\csv'
    output = r'C:\Python33\csv2'
    in_files = os.path.join(directory, '*.csv')

    for in_file in glob.glob(in_files):
        # In Python 3, the csv module expects files opened with newline=''
        with open(in_file, newline='') as input_file:
            reader = csv.reader(input_file)
            cols = []
            for row in reader:
                cols.append(row)

        # "outdent" this code so it only needs to run once for each in_file
        filename = os.path.splitext(os.path.basename(in_file))[0] + '.csv'

        # Indent this to the same level as the rest of the "for in_file" loop!
        with open(os.path.join(output, filename), 'w', newline='') as output_file:
            writer = csv.writer(output_file)
            for i in range(len(max(cols, key=len))):
                writer.writerow([(c[i] if i<len(c) else '') for c in cols])
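On Python 3, the padding loop can also be written with itertools.zip_longest. This is a sketch of an alternative, not the code from the answer above; the function name `transpose_csv` and its parameters are made up for illustration.

```python
import csv
from itertools import zip_longest

def transpose_csv(in_path, out_path):
    """Read in_path, transpose it, and write the result to out_path.

    Rows shorter than the longest row are padded with empty strings.
    """
    with open(in_path, newline='') as input_file:
        rows = list(csv.reader(input_file))
    with open(out_path, 'w', newline='') as output_file:
        # zip_longest(*rows) yields one tuple per input column.
        csv.writer(output_file).writerows(zip_longest(*rows, fillvalue=''))
```

Calling it once per path inside the for in_file loop keeps the read and write steps together, which avoids the indentation mistake from the question.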

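in_files holds only a single pattern string in this form. Try building the list of files yourself, and loop over it directly instead of calling glob.glob():

    in_files = [os.path.join(directory, f) for f in os.listdir(directory) if f.endswith('.csv')]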

Source: https://habr.com/ru/post/1502513/

