Image Processing - Rotate and Optical Character Recognition

Good morning,

Today I want to tackle the topic "Image manipulation in C++".

So far I can filter the noise out of the image and convert it to black and white.

But now I have two questions.

First question :
Below you see a screenshot. What is the best way to rotate the text? In the end, it would be nice if the text were horizontal. Does anyone have a good link or example?

enter image description here


Second question:
How should I proceed? Do you think I should send the image to an Optical Character Recognition (OCR) library (a), or should I cut out each letter myself (b)?

If the answer is (a), what is the smallest OCR library? All the libraries I have found so far (e.g. GOCR or Tesseract) seem bloated and difficult to integrate into an existing project.

If the answer is (b), what is the best way to save each letter as its own image? Should I look for a white pixel, follow the transitions from pixel to pixel, and save the coordinates in a 2D array? And what about the letter "i"? ;)


Thanks to everyone who will help me find my way!
Sorry for the weird English. I'm still a beginner with the language :-)

+4
3 answers

The common name for the problem in your first question is “Skew Correction”.

enter image description here

You can search Google for that term (lots of links). Here is a good paper showing, for example, how to do it:

enter image description here

An easy way to get started (though not as good as the approach mentioned above) is to do a Principal Component Analysis (PCA):

enter image description here
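
As a rough illustration of the PCA idea, here is a minimal sketch that estimates the dominant direction of the white pixels from their covariance. The `Binary` type and the function name `principalAxisAngle` are assumptions made up for this example (they do not come from any library), and the sketch assumes the text block is clearly wider than it is tall, so that its principal axis follows the text line.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image type for this sketch: img[y][x], nonzero = white (text) pixel.
using Binary = std::vector<std::vector<std::uint8_t>>;

// Returns the angle (in radians) between the x axis and the principal axis
// of the white pixels. Image coordinates are used, so y grows downward.
double principalAxisAngle(const Binary& img) {
    double n = 0, sumX = 0, sumY = 0;
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
            if (img[y][x]) { n += 1; sumX += x; sumY += y; }
    if (n < 2) return 0.0;  // not enough pixels to define a direction

    const double meanX = sumX / n, meanY = sumY / n;
    double sxx = 0, syy = 0, sxy = 0;  // unnormalized covariance terms
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
            if (img[y][x]) {
                const double dx = x - meanX, dy = y - meanY;
                sxx += dx * dx;
                syy += dy * dy;
                sxy += dx * dy;
            }

    // Orientation of the eigenvector with the largest eigenvalue
    // of the 2x2 covariance matrix.
    return 0.5 * std::atan2(2.0 * sxy, sxx - syy);
}
```

Rotating the image by the negative of the returned angle should bring the principal axis back to horizontal.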

+4

For your first question:

First remove any “specks” of noisy white pixels that are not part of the letter sequence. Apply a gentle low-pass filter (pixel color = average of the surrounding pixels), then clamp the pixel values to pure black or pure white. This should get rid of the small “dot” under the “a” symbol in your image and any other specks.
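
A minimal sketch of that cleanup step, assuming an 8-bit grayscale image stored as `img[y][x]` with rectangular rows. The `Gray` type, the function name `despeckle`, the 3x3 window (which includes the center pixel), and the threshold of 128 are all assumptions chosen for illustration:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical image type for this sketch: img[y][x], 0 = black, 255 = white.
using Gray = std::vector<std::vector<std::uint8_t>>;

// 3x3 mean filter followed by clamping to pure black or pure white.
Gray despeckle(const Gray& img, int threshold = 128) {
    const int h = static_cast<int>(img.size());
    const int w = h ? static_cast<int>(img[0].size()) : 0;
    Gray out(h, std::vector<std::uint8_t>(w, 0));

    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            int sum = 0, count = 0;
            // Average over the 3x3 neighborhood, skipping out-of-bounds pixels.
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = y + dy, xx = x + dx;
                    if (yy >= 0 && yy < h && xx >= 0 && xx < w) {
                        sum += img[yy][xx];
                        ++count;
                    }
                }
            }
            out[y][x] = (sum / count >= threshold) ? 255 : 0;
        }
    }
    return out;
}
```

With these settings an isolated white pixel averages to about 28 and is clamped to black, while the interior of a letter stroke stays white. Thin strokes and sharp corners can be eroded too, so the window size and threshold may need tuning for a real image.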

Now find the following pixels:

xMin = white pixel with the lowest x value (white pixel closest to the left edge)
xMax = white pixel with the largest x value (white pixel closest to the right edge)
yMin = white pixel with the lowest y value (white pixel closest to the top edge)
yMax = white pixel with the largest y value (white pixel closest to the bottom edge)

With these four pixel values, form a bounding box: Rect(xMin, yMin, xMax, yMax);
Compute the area of the bounding box and find the center.
Using the center of the bounding box, rotate the box by N degrees. (You can pick N: 1 degree would be an ok value.)
Repeat the process of finding xMin, xMax, yMin, yMax and recompute the area.
Continue rotating by N degrees until you've rotated K degrees.
Also rotate by -N degrees until you've rotated by -K degrees. (Where K is the max rotation... say 30 degrees.)
At each step recompute the area of the bounding box.

The rotation that creates the bounding rectangle with the smallest area is probably the one that aligns the letters parallel to the bottom edge (horizontal alignment).

+1

You can measure the height of each white pixel from the bottom and find how much the text leans. This is a very simple approach, but it worked great for me when I tried it.
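
One possible reading of this idea, sketched under assumptions: for each column, take the lowest white pixel, measure its height from the bottom edge, and fit a least-squares line through those heights. The slope of that line indicates how much the text leans. The `Binary` type and the name `baselineSlope` are made up for this example, and the answer itself does not say exactly how the heights were combined.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical image type for this sketch: img[y][x], nonzero = white pixel.
using Binary = std::vector<std::vector<std::uint8_t>>;

// Fits height = slope * x + offset through the lowest white pixel of each
// column and returns the slope (positive = text rises to the right).
double baselineSlope(const Binary& img) {
    const int h = static_cast<int>(img.size());
    const int w = h ? static_cast<int>(img[0].size()) : 0;
    double n = 0, sx = 0, sy = 0, sxx = 0, sxy = 0;

    for (int x = 0; x < w; ++x) {
        // Scan upward from the bottom edge to the first white pixel.
        for (int y = h - 1; y >= 0; --y) {
            if (img[y][x]) {
                const double height = (h - 1) - y;  // distance from bottom
                n += 1; sx += x; sy += height;
                sxx += double(x) * x; sxy += x * height;
                break;
            }
        }
    }

    const double denom = n * sxx - sx * sx;
    if (n < 2 || denom == 0.0) return 0.0;  // not enough columns to fit a line
    return (n * sxy - sx * sy) / denom;
}
```

The lean angle in radians is then `std::atan(baselineSlope(img))`. Descenders such as "g" or "p" and gaps between words will add noise to the fit, so this is only a first approximation.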

0

Source: https://habr.com/ru/post/1347428/

