How to use AcroTextExtractor.exe programmatically?

I am trying to batch-extract text from PDF files. I have tried many libraries, and Adobe Reader seems to be the most accurate text extractor.

I noticed the AcroTextExtractor.exe file in the folder where Adobe Reader is installed. It looks promising, and a Google search suggests that this file is part of the PDF-to-text conversion feature.

How do I call this file from the command line to extract text?

1 answer

I also wanted to use this for the same scenario.

I ran an experiment to see whether I could capture the command line used when AcroTextExtractor.exe runs.

I took a large PDF file and opened it in Adobe Acrobat Reader DC version 2018.009.20050. Then I saved it as text (File | Save as other text), and while Reader was generating the text file (which it did successfully), I checked all the running processes in Task Manager, in Sysinternals Process Explorer, and via WMI from PowerShell.

Unfortunately, I could not find any process whose path included AcroTextExtractor.exe, so I could not capture the command line.
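For anyone who wants to repeat the process check, here is a rough Python sketch of the WMI step. It assumes Windows with PowerShell on the PATH; the helper names `list_processes` and `find_matching` are made up for illustration, and it only takes a single snapshot, so a very short-lived process could still be missed.

```python
import json
import subprocess


def list_processes():
    """Snapshot running processes via WMI/CIM (Windows only, needs PowerShell)."""
    ps_command = (
        "Get-CimInstance Win32_Process | "
        "Select-Object Name, ProcessId, CommandLine | ConvertTo-Json"
    )
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", ps_command],
        capture_output=True, text=True, check=True,
    )
    data = json.loads(result.stdout)
    # ConvertTo-Json emits a single object (not a list) when there is one row.
    return data if isinstance(data, list) else [data]


def find_matching(processes, needle):
    """Return processes whose name or command line contains `needle` (case-insensitive)."""
    needle = needle.lower()
    hits = []
    for proc in processes:
        # CommandLine can be null for processes we are not allowed to inspect.
        text = f"{proc.get('Name') or ''} {proc.get('CommandLine') or ''}".lower()
        if needle in text:
            hits.append(proc)
    return hits
```

While Reader is writing the text file, calling `find_matching(list_processes(), "AcroTextExtractor")` in a loop would show whether the executable ever appears, and with what command line.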

So AcroTextExtractor.exe could be a red herring.


Source: https://habr.com/ru/post/984891/

