Tesseract Number Recognition: what are the most common OCR options

Question

Here is my OCR code for recognizing numbers through the Tesseract engine:

Tesseract *tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];

// Set the Tesseract variables.
// Restrict recognition to digits only.
[tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];

// Page segmentation mode 7: treat the image as a single text line.
NSString *temp = @"7";
[tesseract setVariableValue:temp forKey:@"tessedit_pageseg_mode"];

[tesseract setImage:argImage];
[tesseract recognize];
m_convertedText = [[tesseract recognizedText] copy];

With the code above, some images are recognized correctly. However, some digits are misread: I sometimes get 5 instead of 8, 6 instead of 5, and so on. My input images are very clean: pure black and white after binarization.

Are there any other Tesseract options that I should be setting? I see that there are over 600 options and very little documentation.

The best I could find was this site, which lists all the options, but it is not very clear for OCR beginners.

If anyone has achieved 100 percent accuracy recognizing digits with Tesseract, hearing which settings you used would be really helpful.

ios ocr image-recognition tesseract

Nirav bhatt Sep 11 '13 at 8:50

No one has answered this question yet.
