I have a requirement to convert Excel (2010) files to CSV. I am currently using Excel Interop to open each file and SaveAs CSV, which works well. However, Interop has some problems in the environment where we use it, so I'm looking for another solution.
I found a way to work with Excel files without Interop: the OpenXML SDK. I put together some code that iterates over all the cells on each sheet and simply writes them out to another file as CSV.
One of the problems I have run into is handling empty rows and cells. With this code, empty rows and cells do not seem to exist at all, so I have no way of knowing about them. Is it possible to iterate over all rows and cells, including the empty ones?
    string filename = @"D:\test.xlsx";
    string outputDir = Path.GetDirectoryName(filename);
    //--------------------------------------------------------

    using (SpreadsheetDocument document = SpreadsheetDocument.Open(filename, false))
    {
        foreach (Sheet sheet in document.WorkbookPart.Workbook.Descendants<Sheet>())
        {
            WorksheetPart worksheetPart = (WorksheetPart) document.WorkbookPart.GetPartById(sheet.Id);
            Worksheet worksheet = worksheetPart.Worksheet;

            SharedStringTablePart shareStringPart = document.WorkbookPart.GetPartsOfType<SharedStringTablePart>().First();
            SharedStringItem[] items = shareStringPart.SharedStringTable.Elements<SharedStringItem>().ToArray();

            // Create a new filename and save this file out.
            if (string.IsNullOrWhiteSpace(outputDir))
                outputDir = Path.GetDirectoryName(filename);
            string newFilename = string.Format("{0}_{1}.csv", Path.GetFileNameWithoutExtension(filename), sheet.Name);
            newFilename = Path.Combine(outputDir, newFilename);

            using (var outputFile = File.CreateText(newFilename))
            {
                foreach (var row in worksheet.Descendants<Row>())
                {
                    StringBuilder sb = new StringBuilder();
                    foreach (Cell cell in row)
                    {
                        string value = string.Empty;
                        if (cell.CellValue != null)
                        {
                            // If the content of the first cell is stored as a shared string, get the text
                            // from the SharedStringTablePart. Otherwise, use the string value of the cell.
                            if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                                value = items[int.Parse(cell.CellValue.Text)].InnerText;
                            else
                                value = cell.CellValue.Text;
                        }

                        // to be safe, always use double quotes.
                        sb.Append(string.Format("\"{0}\",", value.Trim()));
                    }
                    outputFile.WriteLine(sb.ToString().TrimEnd(','));
                }
            }
        }
    }
If I have the following Excel file data:

    one,two,three
    ,,
    last,,row

I will get the following CSV (which is wrong):

    one,two,three
    last,row
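To make the gap concrete: in the OpenXML SDK each `Cell` carries a `CellReference` (for example "C3") and each `Row` a `RowIndex`, so the position of a cell is recorded even when its neighbours are absent from the XML. Below is a minimal sketch of how that position could be used to pad the missing fields in one row. It deliberately avoids the SDK and works on plain (reference, value) pairs; the names `SparseRowSketch`, `ColumnIndex` and `ToCsvLine` are hypothetical, and it assumes the cells arrive in column order with references already populated.

```csharp
using System;
using System.Collections.Generic;

static class SparseRowSketch
{
    // Convert the column letters of an A1-style reference to a zero-based
    // column index: "A1" -> 0, "C5" -> 2, "AA10" -> 26.
    public static int ColumnIndex(string cellReference)
    {
        int index = 0;
        foreach (char c in cellReference)
        {
            if (!char.IsLetter(c)) break; // stop at the row number
            index = index * 26 + (char.ToUpperInvariant(c) - 'A' + 1);
        }
        return index - 1;
    }

    // Build one CSV line from the cells that are present, emitting an empty
    // quoted field for every column that was skipped.
    // Assumption: cells are supplied in ascending column order.
    public static string ToCsvLine(IEnumerable<KeyValuePair<string, string>> cells)
    {
        var fields = new List<string>();
        foreach (var cell in cells)
        {
            int col = ColumnIndex(cell.Key);
            while (fields.Count < col)
                fields.Add("\"\""); // placeholder for a missing cell
            fields.Add("\"" + cell.Value.Replace("\"", "\"\"") + "\"");
        }
        return string.Join(",", fields);
    }
}
```

For the third row of the example, passing the pairs ("A3", "last") and ("C3", "row") to `ToCsvLine` yields `"last","","row"`. Missing rows could presumably be handled the same way by comparing consecutive `RowIndex` values and writing blank lines for the difference, though that part is not sketched here.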