OCR that provides an HTML overlay on an image?

I am looking for OCR software that produces an HTML text overlay on an image. I am currently using a product (unnamed here) whose OCR function embeds the recognized text into a PDF document made of scanned images.

The built-in OCR is very convenient: it lets you search an image-based PDF document for text. The text can also be selected directly in the document, because the OCR text is aligned with the underlying image. Unfortunately, I cannot export or store the embedded OCR from that product.

Is there any other software that can run OCR and export the embedded result? I would be particularly interested in exporting to HTML made up of positioned paragraphs that are aligned with the underlying image.

See also:
https://stackoverflow.com/questions/11404805/ocr-and-the-location-of-the-image-where-the-scanned-document-came-from

+6
2 answers

I found the Google Drive API useful when OCR is required. It tries to preserve the document's formatting, and the result can then be exported as HTML.

Take a look at the following links:

+2

I have a possible solution for you, but it has some drawbacks that may keep you from reaching your goal.

First, convert the image file to PDF at http://finereader.abbyyonline.com. Then convert the PDF to HTML at http://document.online-convert.com/convert-to-html

This approach works for page-sized documents, and the end result is HTML with an image overlay. If you only want the HTML laid out like the image, without the image itself showing, just make the images completely transparent.
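To illustrate what "HTML with an image overlay" can look like, here is a minimal Python sketch that is not tied to either of the services above. It assumes you already have, from whatever OCR engine you use, a list of text fragments with pixel bounding boxes. The function name `build_overlay_html`, the `(text, left, top, width, height)` tuple layout, and the sample values are all hypothetical, chosen only for this example.

```python
from html import escape


def build_overlay_html(image_src, page_width, page_height, boxes):
    """Return an HTML page with OCR text positioned on top of the page image.

    boxes is an iterable of (text, left, top, width, height) tuples in image
    pixels. This tuple layout is an assumption made for this sketch, not the
    output format of any particular OCR engine.
    """
    css = (
        f".page {{ position: relative; width: {page_width}px; "
        f"height: {page_height}px; }}\n"
        ".page img { position: absolute; left: 0; top: 0; }\n"
        # Transparent text stays selectable and searchable in the browser
        # while the scanned image remains visible underneath it.
        ".ocr { position: absolute; color: transparent; white-space: nowrap; }\n"
    )
    spans = []
    for text, left, top, width, height in boxes:
        style = (
            f"left:{left}px;top:{top}px;"
            f"width:{width}px;height:{height}px;"
            f"font-size:{height}px;"
        )
        # Escape the recognized text so characters like & and < stay valid HTML.
        spans.append(f'<span class="ocr" style="{style}">{escape(text)}</span>')
    body = "\n".join(spans)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">\n'
        f"<style>\n{css}</style></head>\n"
        '<body><div class="page">\n'
        f'<img src="{escape(image_src)}" '
        f'width="{page_width}" height="{page_height}" alt="">\n'
        f"{body}\n"
        "</div></body></html>\n"
    )


# Hypothetical sample data: two text fragments on an 800x1000 pixel scan.
sample_boxes = [
    ("Invoice no. 42", 60, 40, 300, 28),
    ("Total: 10 < 20 & paid", 60, 90, 420, 24),
]
page = build_overlay_html("page.png", 800, 1000, sample_boxes)
```

Writing `page` to a file next to `page.png` and opening it in a browser should give a scan whose text can be selected and searched. Making the text visible and the image transparent instead is a matter of swapping the `color` rule for an `opacity: 0` rule on the image.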

+1

Source: https://habr.com/ru/post/944819/

