I have a very large CSV dataset (several million records). I have already filtered and massaged and split this list into customer specifications. All this was done in Python3.3
The final requirement is that these splitting lists are saved in Excel format. They have a utility that imports an Excel spreadsheet (in a specific format) into its database after doing some calculations and checking for existing duplicates in the database. My problem is that their utility only works with Excel 2003.xls files ... I did not know this ahead of time.
So, I can already write data in the correct format for Excel 2007 using OpenPyXl, but these files will not work. I can write CSV files, but they do not work either, their importer needs xls files. Perhaps there is a way to batch convert all files from Excel 2007 xlsx format to xls format or from csv format to xls format? There are thousands of files, so this cannot be done manually.
The best thing to do is to output them in the correct format, but I cannot find a python 3 compatible method that will work with Excel 2003 format. Xlwt is only python 2.x.
Does anyone have any suggestions on how I can end this?
EDIT: It was like a solution.
EDIT2: Added workbook as suggested by stenci.
import os import errno import glob import time import win32com.client def xlsx_to_xls(path): xlsx_files = glob.glob(path+'\\*.xlsx') if len(xlsx_files) == 0: raise RuntimeError('No XLSX files to convert.') xlApp = win32com.client.Dispatch('Excel.Application') for file in xlsx_files: xlWb = xlApp.Workbooks.Open(os.path.join(os.getcwd(), file)) xlWb.SaveAs(os.path.join(os.getcwd(), file.split('.xlsx')[0] + '.xls'), FileFormat=1) xlWb.Close() xlApp.Quit() time.sleep(2)
source share