I am trying to extract the digits from a sudoku board. After detecting the board, its corners, and the transformation, I am left with a fairly well-aligned image of just the board. Now I am trying to recognize the digits using Tesseract on Android (Tess-Two). I divided the image into 9 parts with
currentCell = undistortedThreshed.submat(rect);
where rect is the rectangle bounding that part of the image.
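For context, the rectangles come from a uniform grid split. The following is only a minimal sketch of that idea in plain Java, not my exact code: the class name `CellGrid`, the method `cellRects`, and the grid size `n` are hypothetical, and it assumes the board image is divided into an evenly spaced n x n grid.

```java
import java.util.ArrayList;
import java.util.List;

class CellGrid {
    // Splits a width x height image into an n x n grid, row by row.
    // Each entry is {x, y, w, h}; the values could be passed to an
    // OpenCV Rect(x, y, w, h) before calling submat (assumption).
    static List<int[]> cellRects(int width, int height, int n) {
        List<int[]> rects = new ArrayList<>();
        for (int row = 0; row < n; row++) {
            for (int col = 0; col < n; col++) {
                // Integer boundaries, so the cells tile the image exactly
                // even when width or height is not divisible by n.
                int x0 = col * width / n;
                int y0 = row * height / n;
                int x1 = (col + 1) * width / n;
                int y1 = (row + 1) * height / n;
                rects.add(new int[] {x0, y0, x1 - x0, y1 - y0});
            }
        }
        return rects;
    }
}
```

Computing both boundaries of each cell, rather than using a fixed cell size, avoids losing a few pixels at the right and bottom edges when the image size is not a multiple of n.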
Now for the recognition itself.
Some digits, such as 4, are recognized well. Others, mostly 6, 7, and 8, are recognized as 0 or not at all.
I want to help Tesseract as much as possible by cleaning up the currentCell image. At the moment it looks like this:
[image]
(I also tried without an inverted threshold.) I want to get rid of the white lines (the sudoku grid lines). I tried something like this (taken from here):
    Imgproc.Canny(currentCell, currentCell, 80, 90);
    Mat lines = new Mat();
    int threshold = 50;
    int minLineSize = 5;
    int lineGap = 20;
    Imgproc.HoughLinesP(currentCell, lines, 1, Math.PI / 180, threshold, minLineSize, lineGap);
    for (int x = 0; x < lines.cols() && x < 1; x++) {
        double[] vec = lines.get(0, x);
        double x1 = vec[0], y1 = vec[1], x2 = vec[2], y2 = vec[3];
        Point start = new Point(x1, y1);
        Point end = new Point(x2, y2);
        Core.line(currentCell, start, end, new Scalar(255), 10);
    }
but it draws nothing. I tried changing the width and color of the line, but still nothing. I also tried drawing the lines on the large, uncropped image, and that does not work either.
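To make the goal concrete, here is the effect I am after, sketched on a plain `int[][]` array instead of a Mat. This is not the Hough approach above but a different technique: a flood fill from the cell border that clears every white pixel connected to the edge. The class name `BorderCleaner` and the method `clearBorderConnected` are hypothetical, and it assumes a non-empty binary image with white = 255 and black = 0.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BorderCleaner {
    // Sets to 0 every 255-valued pixel that is 4-connected to the image border.
    // Pixels are addressed as img[y][x].
    static void clearBorderConnected(int[][] img) {
        int h = img.length;
        int w = img[0].length;
        Deque<int[]> stack = new ArrayDeque<>();
        // Seed the fill with every pixel on the outer border.
        for (int x = 0; x < w; x++) {
            stack.push(new int[] {0, x});
            stack.push(new int[] {h - 1, x});
        }
        for (int y = 0; y < h; y++) {
            stack.push(new int[] {y, 0});
            stack.push(new int[] {y, w - 1});
        }
        while (!stack.isEmpty()) {
            int[] p = stack.pop();
            int y = p[0], x = p[1];
            // Skip out-of-range coordinates and pixels that are already black.
            if (y < 0 || y >= h || x < 0 || x >= w || img[y][x] == 0) {
                continue;
            }
            img[y][x] = 0; // erase the pixel, then visit its 4 neighbours
            stack.push(new int[] {y + 1, x});
            stack.push(new int[] {y - 1, x});
            stack.push(new int[] {y, x + 1});
            stack.push(new int[] {y, x - 1});
        }
    }
}
```

One limitation of this sketch: a digit that touches a grid line would be erased together with the line.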
Any suggestions?
EDIT
For some reason, it cannot find any lines. Here is what the image looks like after applying Canny to it:
[image]
but HoughLines does not detect any lines. I tried both HoughLines and HoughLinesP with different values, as shown in the OpenCV documentation, but nothing works. These are pretty obvious lines. What am I doing wrong? Thanks!