Tesseract OCR training with predefined font images

I am trying to recognize ASCII strings in an image. I use the Tesseract 3 library, but recognition is not quite correct, so I need to train it on a new character set. I have already read the HOW-TO TrainingTesseract3, but the manual includes some procedures that I don't need because my test images are so simple. Each image in my data set contains only a single line of text, and every ASCII character looks the same in all images (no rotation, no scaling); only the horizontal distance between the characters in the line varies.

How can I use font images to train the recognition algorithm?

1 answer

Just pick the font you want to train, type each letter or digit in that font in a text editor (I think about 5 repetitions per character), and save the result as a TIFF file. To do the training itself, use either https://code.google.com/p/serak-tesseract-trainer/ or http://vietocr.sourceforge.net/training.html .
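If you would rather generate the training image with a script than type it by hand, the step above can be sketched roughly as follows. This is only an illustration under assumptions, not part of either trainer tool: it assumes Python with the third-party Pillow package and a TrueType file for your font, and the function names `build_training_lines` and `render_to_tiff` are made up for this example. The 5 repetitions per character are the rough guess from above.

```python
import math
import string

REPEATS = 5  # rough guess from the answer: about 5 samples per character


def build_training_lines(chars=None, repeats=REPEATS, per_line=40):
    """Return text lines in which every character occurs `repeats` times."""
    if chars is None:
        # Printable ASCII without whitespace: digits, letters, punctuation.
        chars = string.digits + string.ascii_letters + string.punctuation
    # Repeat the whole alphabet so identical glyphs are spread over the page.
    sequence = list(chars) * repeats
    # Put a space between glyphs so each one can get its own box later.
    return [" ".join(sequence[i:i + per_line])
            for i in range(0, len(sequence), per_line)]


def render_to_tiff(lines, font_path, out_path, font_size=32, margin=20):
    """Draw the lines in the given TrueType font and save a grayscale TIFF."""
    # Pillow is imported here so the text helper works without it installed.
    from PIL import Image, ImageDraw, ImageFont

    font = ImageFont.truetype(font_path, font_size)
    line_height = font_size * 2  # generous spacing between text lines

    # Measure the widest line on a throwaway canvas to size the real one.
    probe = ImageDraw.Draw(Image.new("L", (1, 1)))
    text_width = max(math.ceil(probe.textlength(line, font=font))
                     for line in lines)

    size = (text_width + 2 * margin, line_height * len(lines) + 2 * margin)
    image = Image.new("L", size, color=255)  # white background
    draw = ImageDraw.Draw(image)
    for row, line in enumerate(lines):
        # Black text, one line per row, left-aligned at the margin.
        draw.text((margin, margin + row * line_height), line,
                  fill=0, font=font)
    image.save(out_path)  # a .tif extension makes Pillow write TIFF
```

A call such as `render_to_tiff(build_training_lines(), "myfont.ttf", "eng.myfont.exp0.tif")` would produce one page; both file names are placeholders, with the output name following the `[lang].[fontname].exp[num].tif` pattern used in the Tesseract 3 training guide. The resulting image can then be loaded into either of the trainer tools linked above.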


Source: https://habr.com/ru/post/1543430/

