Automate Doc to PDF in C#

I have about 200 Word documents that I need to convert to PDF.

Obviously, I cannot convert them one by one: first, it would take a lot of time, and second, I am sure it is not good practice.

I need to find a way to automate this conversion, since we will need it again and again.

I use C#. The solution does not have to be in C#, but that is preferred.

I looked at several libraries, such as PDFCreator, the Office 2007 add-in, iTextSharp, etc., and there is no clear answer on the forums.

PDFCreator has a C# sample, but it only works with txt files. The Office 2007 add-in lacks the document-locking capabilities required for automation.

Has anyone implemented such a scenario before? I would like to hear your suggestions.

Thank you in advance

Regards

+4
7 answers

I do this to automate the conversion of our doc and docx documents to pdf:

    using System;
    using System.IO;
    using OW = Microsoft.Office.Interop.Word;

    // Converts one .doc/.docx file to a PDF next to it; returns false on failure.
    private bool ConvertDocument(string file)
    {
        object missing = System.Reflection.Missing.Value;
        OW.Application word = null;
        OW.Document doc = null;

        try
        {
            word = new OW.Application();
            word.Visible = false;
            word.ScreenUpdating = false;

            object filename = file;
            doc = word.Documents.Open(ref filename,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing);
            doc.Activate();

            // Swap only the final extension (.doc or .docx) for .pdf.
            string pdfFile = Path.ChangeExtension(file, ".pdf");

            doc.ExportAsFixedFormat(pdfFile, OW.WdExportFormat.wdExportFormatPDF, false,
                OW.WdExportOptimizeFor.wdExportOptimizeForPrint,
                OW.WdExportRange.wdExportAllDocument, 1, 1,
                OW.WdExportItem.wdExportDocumentContent, true, true,
                OW.WdExportCreateBookmarks.wdExportCreateNoBookmarks,
                true, true, false, ref missing);
        }
        catch (Exception)
        {
            // Any failure is reported to the caller as false.
            return false;
        }
        finally
        {
            // Always close the document and quit Word, even after an error.
            if (doc != null)
            {
                object saveChanges = OW.WdSaveOptions.wdDoNotSaveChanges;
                ((OW._Document)doc).Close(ref saveChanges, ref missing, ref missing);
                doc = null;
            }
            if (word != null)
            {
                ((OW._Application)word).Quit(ref missing, ref missing, ref missing);
                word = null;
            }
        }

        return true;
    }

where OW is an alias for Microsoft.Office.Interop.Word.
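
Since the question involves about 200 files, a batch run also needs to collect the documents and work out each output path. Here is a minimal sketch of that part only, with no Word dependency. The names BatchPaths, FindWordDocuments and PdfPathFor are hypothetical and not part of the answer above. Path.ChangeExtension replaces only the final extension, so a ".doc" appearing elsewhere in the path is left alone.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for a batch run; not part of the original answer.
public static class BatchPaths
{
    // Returns the PDF path that corresponds to a .doc/.docx file.
    public static string PdfPathFor(string wordFile)
    {
        return Path.ChangeExtension(wordFile, ".pdf");
    }

    // Lists the .doc and .docx files directly inside a folder,
    // matching the extension case-insensitively.
    public static List<string> FindWordDocuments(string folder)
    {
        return Directory.EnumerateFiles(folder)
            .Where(f =>
            {
                string ext = Path.GetExtension(f);
                return ext.Equals(".doc", StringComparison.OrdinalIgnoreCase)
                    || ext.Equals(".docx", StringComparison.OrdinalIgnoreCase);
            })
            .OrderBy(f => f, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Each path returned by FindWordDocuments could then be handed to a conversion routine such as ConvertDocument.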

+3

Have you checked this MSDN article?


Edit:

Please note that these "How To" samples will not work as-is, because:

  • For some reason, the sample overwrites the program arguments ( ConvertDocCS.exe [sourceDoc] [targetDoc] [targetFormat] ) on lines 77, 81 and 82.
  • I converted the project to VS 2010 and had to re-add the reference to Microsoft.Office.Core. This COM reference is called the Microsoft Office 12.0 Object Library.
  • The sample does not support relative paths.

I am sure that you will be able to overcome these obstacles :)


One last thing: if you are working with .NET 4, you do not need to pass all those annoying Missing.Value arguments, thanks to the miracle of optional parameters.
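
As a rough sketch of that last point: with C# 4 and later, calls on COM interop interfaces can omit optional arguments and the ref keyword, so the Word calls become much shorter. The class name WordToPdf and the method Convert are hypothetical. The sketch assumes a reference to Microsoft.Office.Interop.Word and an installed copy of Word that supports PDF export, so it cannot run without them.

```csharp
using OW = Microsoft.Office.Interop.Word;

// Hypothetical wrapper; needs Word installed and a reference to
// Microsoft.Office.Interop.Word.
public static class WordToPdf
{
    // Absolute paths are the safer choice here, since Word is a separate
    // process and may not share the caller's working directory.
    public static void Convert(string sourcePath, string pdfPath)
    {
        OW.Application word = new OW.Application();
        OW.Document doc = null;
        try
        {
            // Only the arguments that matter are passed; no Missing.Value.
            doc = word.Documents.Open(FileName: sourcePath, ReadOnly: true);
            doc.ExportAsFixedFormat(pdfPath, OW.WdExportFormat.wdExportFormatPDF);
        }
        finally
        {
            // Close without saving, then shut Word down even after an error.
            if (doc != null)
                ((OW._Document)doc).Close(SaveChanges: OW.WdSaveOptions.wdDoNotSaveChanges);
            ((OW._Application)word).Quit();
        }
    }
}
```

The casts to _Document and _Application are kept from the earlier answer; they avoid the ambiguity between the Close/Quit methods and the events of the same name.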

+2

You can try Aspose.Words for .NET to convert DOC files to PDF. It can be used in any .NET application with C# or VB.NET, like any other .NET assembly. It also works on any Windows OS and on 32/64-bit systems.

Disclosure: I work as a developer evangelist at Aspose.

+1

As HuBeZa said, if Word is installed on your workstation, you can use Word Automation to open the files one at a time and save them in PDF format. All you need is a reference to the COM component "Microsoft Word Object Library" and some experimenting with the classes in that assembly.

Execution will probably be a little slow, but your conversions will be automated.

0

We can also set fonts when automating Word. In my solution for the same kind of application, I applied a single font to all the generated documents, which saved the time of manually opening each template and setting the font separately for every tag, title, etc.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    // 'input' is the .docx file to modify; it is opened for editing.
    using (WordprocessingDocument wordProcessingDocument = WordprocessingDocument.Open(input, true))
    {
        // Get the top-level elements of the document body
        List<OpenXmlElement> elements = wordProcessingDocument.MainDocumentPart.Document.Body.ToList();

        // Get and set the font properties of every run inside each element
        foreach (var itm in elements)
        {
            try
            {
                List<RunProperties> list_runProperties = itm.Descendants<RunProperties>().ToList();
                foreach (var item in list_runProperties)
                {
                    if (item.RunFonts == null)
                        item.RunFonts = new RunFonts();

                    item.RunFonts.Ascii = "Courier New";
                    item.RunFonts.ComplexScript = "Courier New";
                    item.RunFonts.HighAnsi = "Courier New";
                    item.RunFonts.Hint = FontTypeHintValues.ComplexScript;
                }
            }
            catch (Exception)
            {
                // continue with the other tags in the document
                // throw;
            }
        }

        wordProcessingDocument.MainDocumentPart.Document.Save();
    }

0

I think there is no direct answer to this! But as a workaround, I suggest trying ImageMagick or a similar library to see whether it can render images of your Word documents, and then using those images in iTextSharp to create the PDF.

-2

Source: https://habr.com/ru/post/1340184/

