Tesseract OCR training with predefined font images

Question

I am trying to recognize ASCII strings in images. I use the Tesseract 3 library, but recognition is not quite accurate, so I need to train it on my own specific character set. I have already read the TrainingTesseract3 how-to, but it contains steps I don't need given how simple my test images are. Each image contains a single line of text, and every ASCII character looks the same in all images (no rotation, no scaling); only the horizontal distance between the characters in the line varies.

How can I use font images to train the recognition algorithm?

+4

pattern-matching ocr ascii tesseract training-data

Tomil Jun 05 '14 at 15:11

1 answer

Jeffrey orcena · Answer 1 · 2014-06-11T00:43:12+0000

Just pick the font you want to train, then type out each letter or number in that font (I think about 5 repetitions per character) and save the result as a TIFF file. To do the training itself, use either https://code.google.com/p/serak-tesseract-trainer/ or http://vietocr.sourceforge.net/training.html .
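Since the glyphs in the question never change shape and only the horizontal gaps vary, the box file that Tesseract 3 training expects alongside each TIFF could in principle be generated rather than drawn by hand in one of the tools above. The sketch below is only an illustration under some assumptions: every glyph is taken to occupy the same fixed-size rectangle, and the function name `box_lines` and its parameters are hypothetical, not part of Tesseract. The one thing it relies on from Tesseract itself is the box-file line layout, `<char> <left> <bottom> <right> <top> <page>`, with the origin at the bottom-left corner of the image.

```python
def box_lines(text, x_offsets, glyph_w, glyph_h, image_h, top=0, page=0):
    """Build Tesseract 3 box-file lines for one line of text.

    Hypothetical helper, not a Tesseract API.
    text      -- the characters in the image, left to right
    x_offsets -- left edge of each character in pixels (same length as text)
    glyph_w   -- assumed fixed glyph width in pixels
    glyph_h   -- assumed fixed glyph height in pixels
    image_h   -- image height in pixels
    top       -- distance from the top of the image to the top of the glyphs
    """
    if len(text) != len(x_offsets):
        raise ValueError("need exactly one x offset per character")

    # Box files count y from the bottom of the image, so flip the
    # top-down pixel coordinates that image editors normally report.
    bottom = image_h - (top + glyph_h)
    upper = image_h - top

    lines = []
    for ch, left in zip(text, x_offsets):
        if ch == " ":
            continue  # spaces are gaps, so they get no box
        lines.append(f"{ch} {left} {bottom} {left + glyph_w} {upper} {page}")
    return lines


if __name__ == "__main__":
    # Illustrative numbers only: 12x20 px glyphs in a 40 px tall image.
    for line in box_lines("AB C", [10, 30, 50, 75], 12, 20, 40, top=8):
        print(line)
```

The printed lines would be saved under the same base name as the TIFF with a .box extension. If the font is not fixed-width, tighter per-character boxes would be needed, so the output is worth checking in a box editor before training.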
