Convert Word docx to Excel using OpenXML

Is there a way to convert a Word document that contains some tables into an Excel file? Converting just the tables would be very helpful.

Something like this:

  • Open a Word document using OpenXML
  • Find all table XML tags
  • Copy the XML tags
  • Create an Excel file
  • Paste the XML tags with a table from Word into the new Excel file.

I mean something like this:

    void OpenWordDoc(string filePath)
    {
        // a .docx is opened with WordprocessingDocument, not SpreadsheetDocument
        _documentWord = WordprocessingDocument.Open(filePath, true);
    }

    List<string> GetAllTablesXMLTags()
    {
        // find and copy
    }

    void CreateExcelFile(string filePath)
    {
        TemplateExcelDocument excelDocument = new TemplateExcelDocument();
        _documentExcel = excelDocument.CreatePackage(filePath);
    }

    void InsertXmlTagsToExcelFile(string filePath)
    {
        CreateExcelFile(filePath);
        var xmlTable = GetAllTablesXMLTags();
        // ... insert to _documentExcel
    }
2 answers

Your steps are correct.

Here are some SDK docs that may help to some extent:

Open XML SDK 2.5 for Office

For processing Word tables:

Working with WordprocessingML Tables (Open XML SDK)

For processing Excel tables:

Working with the shared string table (Open XML SDK)

Working with SpreadsheetML Tables (Open XML SDK)
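
As a rough sketch of how those pieces might fit together, the following reads the text of each Word table and writes it to a new workbook, one sheet per table. WordprocessingML table markup (w:tbl) and SpreadsheetML sheet markup (sheetData) are different schemas, so the sketch copies cell text rather than the XML elements themselves. The class and method names (WordTablesToExcel, ReadTables, WriteWorkbook) are hypothetical, and it assumes the DocumentFormat.OpenXml NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using S = DocumentFormat.OpenXml.Spreadsheet;
using W = DocumentFormat.OpenXml.Wordprocessing;

public static class WordTablesToExcel   // hypothetical helper name
{
    // Returns every table in the .docx as rows of cell text.
    // InnerText joins a cell's paragraphs without separators, and nested
    // tables are returned both on their own and inside the parent cell text.
    public static List<List<List<string>>> ReadTables(string docxPath)
    {
        using (var word = WordprocessingDocument.Open(docxPath, false))
        {
            var body = word.MainDocumentPart.Document.Body;
            return body.Descendants<W.Table>()
                .Select(t => t.Elements<W.TableRow>()
                    .Select(r => r.Elements<W.TableCell>()
                        .Select(c => c.InnerText).ToList())
                    .ToList())
                .ToList();
        }
    }

    // Writes each table to its own worksheet, storing values as inline strings.
    public static void WriteWorkbook(string xlsxPath, List<List<List<string>>> tables)
    {
        if (tables.Count == 0)
            throw new ArgumentException("A workbook needs at least one sheet.");

        using (var excel = SpreadsheetDocument.Create(xlsxPath, SpreadsheetDocumentType.Workbook))
        {
            var workbookPart = excel.AddWorkbookPart();
            workbookPart.Workbook = new S.Workbook();
            var sheets = workbookPart.Workbook.AppendChild(new S.Sheets());

            for (int i = 0; i < tables.Count; i++)
            {
                var sheetData = new S.SheetData();
                foreach (var row in tables[i])
                {
                    var xlRow = new S.Row();
                    foreach (var text in row)
                    {
                        xlRow.Append(new S.Cell
                        {
                            DataType = S.CellValues.InlineString,
                            InlineString = new S.InlineString(new S.Text(text))
                        });
                    }
                    sheetData.Append(xlRow);
                }

                var worksheetPart = workbookPart.AddNewPart<WorksheetPart>();
                worksheetPart.Worksheet = new S.Worksheet(sheetData);
                sheets.Append(new S.Sheet
                {
                    Id = workbookPart.GetIdOfPart(worksheetPart),
                    SheetId = (uint)(i + 1),
                    Name = "Table" + (i + 1)
                });
            }
            workbookPart.Workbook.Save();
        }
    }
}
```

A call such as WordTablesToExcel.WriteWorkbook("tables.xlsx", WordTablesToExcel.ReadTables("test.docx")) would then do the whole conversion; the file names here are placeholders. Inline strings keep the sketch short; the shared string table described in the docs above is the more compact choice for large sheets.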


To get all the tables in the docx file, you can use the following code:

    using System;
    using Independentsoft.Office;
    using Independentsoft.Office.Word;
    using Independentsoft.Office.Word.Tables;

    namespace Sample
    {
        class Program
        {
            static void Main(string[] args)
            {
                WordDocument doc = new WordDocument("c:\\test.docx");
                Table[] tables = doc.GetTables();

                foreach (Table table in tables)
                {
                    // read data
                }
            }
        }
    }

To write them to an Excel file, do the following, calling excel once for each cell:

    // app is an Excel Application object (Microsoft.Office.Interop.Excel) created earlier
    app.Visible = false;
    workbooks = app.Workbooks;
    workbook = workbooks.Add(XlWBATemplate.xlWBATWorksheet);
    sheets = workbook.Worksheets;
    worksheet = (_Worksheet)sheets.get_Item(1);

    // repeat this call for every cell
    excel(row, column, "value");

    workbook.Saved = true;
    workbook.SaveAs(output_file);
    app.UserControl = false;
    app.Quit();

Finally, the excel function is as follows:

    public void excel(int row, int column, string value)
    {
        worksheet.Cells[row, column] = value;
    }
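
To show what "for each cell" might look like in practice, here is a minimal sketch that loops over rows of strings and writes them through the same Cells indexer. The class name InteropTableWriter and the rows parameter are hypothetical; how the text is pulled out of each Word table is left to the reading code, and the sketch assumes a reference to Microsoft.Office.Interop.Excel.

```csharp
using System.Collections.Generic;
using Microsoft.Office.Interop.Excel;

public static class InteropTableWriter   // hypothetical helper name
{
    // Writes rows[r][c] to the worksheet, one cell at a time.
    public static void WriteRows(_Worksheet worksheet, IList<IList<string>> rows)
    {
        for (int r = 0; r < rows.Count; r++)
        {
            for (int c = 0; c < rows[r].Count; c++)
            {
                // Excel row and column indices start at 1, not 0.
                worksheet.Cells[r + 1, c + 1] = rows[r][c];
            }
        }
    }
}
```

Each assignment is a separate COM call, so this is only practical for small tables.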

You can also use CSV or HTML to create a file that Excel can open. For CSV, simply create a text file (for example, example.csv) with comma-separated content, one row per line:

    col1,col2,col3,col4
    val1,val2,val3,val4
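
A minimal sketch of generating such a file from code, assuming the table text is already available as rows of strings. The names CsvExport, Escape, ToCsv and Save are hypothetical; the quoting follows the common CSV convention of wrapping a field in double quotes when it contains a comma, a quote or a line break.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvExport   // hypothetical helper name
{
    // Quotes a field only when it contains a comma, a quote, or a line break;
    // embedded quotes are doubled.
    public static string Escape(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Joins cells with commas and rows with CRLF.
    public static string ToCsv(IEnumerable<IEnumerable<string>> rows)
    {
        return string.Join("\r\n", rows.Select(r => string.Join(",", r.Select(Escape))));
    }

    public static void Save(string path, IEnumerable<IEnumerable<string>> rows)
    {
        File.WriteAllText(path, ToCsv(rows));
    }
}
```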

or in HTML format:

    <table>
      <tr>
        <td>col1</td>
        <td>col2</td>
        <td>col3</td>
      </tr>
      <tr>
        <td>val1</td>
        <td>val2</td>
        <td>val3</td>
      </tr>
    </table>
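
The same markup could be produced from rows of strings with a few lines of code. This is only a sketch: HtmlTableExport and ToHtmlTable are hypothetical names, and cell text is HTML-encoded so that characters such as < or & in the Word data do not break the table.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;

public static class HtmlTableExport   // hypothetical helper name
{
    // Builds a plain <table> with one <tr> per row and one <td> per cell.
    public static string ToHtmlTable(IEnumerable<IEnumerable<string>> rows)
    {
        var sb = new StringBuilder("<table>");
        foreach (var row in rows)
        {
            sb.Append("<tr>");
            foreach (var cell in row)
            {
                sb.Append("<td>").Append(WebUtility.HtmlEncode(cell)).Append("</td>");
            }
            sb.Append("</tr>");
        }
        return sb.Append("</table>").ToString();
    }
}
```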

Source: https://habr.com/ru/post/944619/

