How to use OpenCV + Tesseract for accurate text recognition in Android?

Question

I am trying to use OpenCV (Android) to process an image taken with the camera and then pass it to Tesseract to recognize text (numbers), but the results are poor unless the image is already almost free of noise. Currently, I process the captured image as follows:

1. Apply a Gaussian blur.
2. Apply an adaptive threshold to binarize the image.
3. Invert the colors to get a black background.

I then pass the processed image to Tesseract.

But I do not get good results.
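
For reference, steps 2 and 3 above can be sketched in plain Java as follows. This is only an illustration of what a mean adaptive threshold with inverted output does, not the OpenCV implementation: the class name `AdaptiveThresholdSketch`, the method `adaptiveThresholdInverted`, and the parameters `blockSize` and `c` are hypothetical names chosen for this example, and the window is simply clipped at the image border.

```java
import java.util.Arrays;

final class AdaptiveThresholdSketch {

    /**
     * Mean adaptive threshold with inverted output (illustrative only).
     * A pixel becomes white (255) when it is at most (local mean - c),
     * otherwise black (0). gray holds values 0..255, blockSize is the
     * odd width of the square neighbourhood.
     */
    static int[][] adaptiveThresholdInverted(int[][] gray, int blockSize, int c) {
        int h = gray.length;
        int w = gray[0].length;
        int r = blockSize / 2;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                long sum = 0;
                int count = 0;
                // Average over the window, clipped at the image border.
                for (int yy = Math.max(0, y - r); yy <= Math.min(h - 1, y + r); yy++) {
                    for (int xx = Math.max(0, x - r); xx <= Math.min(w - 1, x + r); xx++) {
                        sum += gray[yy][xx];
                        count++;
                    }
                }
                double localMean = (double) sum / count;
                // Dark strokes on a light background end up white on black.
                out[y][x] = gray[y][x] <= localMean - c ? 255 : 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Light background (200) with a single dark pixel (50) in the middle.
        int[][] img = new int[5][5];
        for (int[] row : img) {
            Arrays.fill(row, 200);
        }
        img[2][2] = 50;
        for (int[] row : adaptiveThresholdInverted(img, 3, 10)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Because the threshold is computed per neighbourhood, the result depends strongly on `blockSize` and `c`: a window that is small relative to the stroke width tends to hollow out thick characters, so these two values are usually the first thing to tune.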

Please suggest what steps or measures I can take to process the image before passing it to Tesseract, or during the recognition step in Tesseract itself.

Also, are there any better libraries for this on Android?

android opencv ocr tesseract

arorak Apr 29 '14 at 10:06

1 answer

AmmarCSE · Answer 1 · 2014-04-29T10:20:48+0000

You can isolate / detect the characters in the image first. This can be done with powerful algorithms such as the Stroke Width Transform (SWT).

The following steps worked well for me:

1. Convert the image to grayscale.
2. Perform Canny edge detection on the grayscale image.
3. Apply a Gaussian blur to the grayscale image (save the result in a separate matrix).
4. Feed the matrices from steps 2 and 3 into the SWT algorithm.
5. Binarize (threshold) the resulting image.
6. Pass the image to Tesseract.
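
A minimal plain-Java sketch of the first and fifth steps (grayscale conversion and binarization) is given here for illustration. It makes several assumptions that are not part of the steps above: the input is an array of packed ARGB pixels (the layout Android's `Bitmap.getPixels` produces), the threshold is chosen with Otsu's method (the steps only say "threshold"), and the names `GrayscaleOtsuSketch`, `toGray`, `otsuThreshold` and `binarize` are hypothetical. Canny edge detection, the SWT itself and the Tesseract call are not shown.

```java
final class GrayscaleOtsuSketch {

    /** Converts packed ARGB pixels to 0..255 luminance values. */
    static int[] toGray(int[] argb) {
        int[] gray = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            int r = (argb[i] >> 16) & 0xFF;
            int g = (argb[i] >> 8) & 0xFF;
            int b = argb[i] & 0xFF;
            // Integer approximation of the weights 0.299, 0.587, 0.114.
            gray[i] = (299 * r + 587 * g + 114 * b) / 1000;
        }
        return gray;
    }

    /**
     * Otsu's method (an assumption of this sketch): returns the threshold t
     * that maximises the between-class variance of the two groups
     * "value <= t" and "value > t".
     */
    static int otsuThreshold(int[] gray) {
        int[] hist = new int[256];
        for (int v : gray) {
            hist[v]++;
        }
        long total = gray.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (long) i * hist[i];
        }
        long countBelow = 0;
        long sumBelow = 0;
        double bestScore = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            countBelow += hist[t];
            sumBelow += (long) t * hist[t];
            if (countBelow == 0) {
                continue; // no pixels in the lower class yet
            }
            long countAbove = total - countBelow;
            if (countAbove == 0) {
                break; // every pixel is already in the lower class
            }
            double meanBelow = (double) sumBelow / countBelow;
            double meanAbove = (double) (sumAll - sumBelow) / countAbove;
            double diff = meanBelow - meanAbove;
            // Proportional to the between-class variance.
            double score = (double) countBelow * countAbove * diff * diff;
            if (score > bestScore) {
                bestScore = score;
                best = t;
            }
        }
        return best;
    }

    /** Maps values above the threshold to 255 and all others to 0. */
    static int[] binarize(int[] gray, int threshold) {
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = gray[i] > threshold ? 255 : 0;
        }
        return out;
    }
}
```

A typical call sequence under these assumptions would be `int[] gray = toGray(pixels);` followed by `binarize(gray, otsuThreshold(gray))`. A single global threshold like this works best when the lighting is even; under uneven lighting a locally adaptive threshold may be the better choice.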

Note: step 4 relies on C++ code, which is used from Android through JNI.
