OCR and similarity of characters

Question


I am currently working on an optical character recognition (OCR) system. I have already written a script that extracts each character from the text and cleans up (most of) the irregularities. I also know the font. Examples of the images I have now:

M ( http://i.imgur.com/oRfSOsJ.png (font) and http://i.imgur.com/UDEJZyV.png (scanned))

K ( http://i.imgur.com/PluXtDz.png (font) and http://i.imgur.com/TRuDXSx.png (scanned))

C ( http://i.imgur.com/wggsX6M.png (font) and http://i.imgur.com/GF9vClh.png (scanned))

For all these images, I already have a binary matrix (1 for black, 0 for white). Now I am wondering whether there is some mathematical formula, such as a projection, for measuring the similarity between these matrices. I do not want to rely on a library, because that is not what my task is about.

I know this question may seem a bit vague and that there are similar questions, but I am looking for a method, not a package, and so far I have not been able to find anything that describes the method itself. The question is vague because I really have nothing to start from. What I want to do is described here on Wikipedia:

Matrix matching involves comparing an image to a stored glyph on a pixel-by-pixel basis; it is also known as "pattern matching" or "pattern recognition". [9] This relies on the input glyph being correctly isolated from the rest of the image, and on the stored glyph being in a similar font and at the same scale. This technique works best with typewritten text and does not work well when new fonts are encountered. This is the technique that early physical photocell-based OCR implemented, rather directly. ( http://en.wikipedia.org/wiki/Optical_character_recognition#Character_recognition )

If anyone could help me with this, I would really appreciate it.


math matrix ocr projection

JohannesB Apr 01 '14 at 12:44


1 answer

Spektre · Accepted Answer · 2014-04-05T09:29:00+0000

For recognition or classification, most OCR systems use neural networks.

These must be properly configured for the desired task (the internal interconnection architecture, and so on). Another problem with neural networks is that they must be properly trained, which is quite hard to get right: you need to know what a suitable training data set size is, so that it contains enough information but does not overtrain the network. If you have no experience with neural networks, do not go this way if you need to implement it yourself.

There are other ways to compare patterns.

vector approach
- polygonize the image (edges or borders)
- compare the similarity of the polygons (surface area, perimeter, shape, ...)
pixel approach
You can compare images based on:
- histogram
- DFT / DCT spectral analysis
- size
- number of occupied pixels per row
- start position of the occupied pixels in each row (from the left)
- end position of the occupied pixels in each row (from the right)
- the same 3 parameters can also be computed per column
- list of points of interest (points where something changes, such as an intensity bump, an edge, ...)
You create a list of features for each tested character, compare it with the features of your font, and the closest match is your character. The feature lists can also be scaled to some fixed size (for example, 64x64), so that recognition becomes scale invariant.
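Scaling a feature list to a fixed size can be sketched as below. This is only an illustration, not the answerer's code: the function name `resample_feature`, the use of linear interpolation, and the `scale` divisor (for example the glyph width, so that values become fractions of the glyph size) are all assumptions.

```cpp
#include <cstddef>
#include <vector>

// Resample a feature array of arbitrary length to exactly n samples using
// linear interpolation. Each value is divided by `scale` (hypothetical
// normalization, e.g. the glyph width) so glyphs of different sizes compare.
std::vector<double> resample_feature(const std::vector<int>& src,
                                     std::size_t n, double scale)
{
    std::vector<double> dst(n, 0.0);
    if (src.empty() || n == 0 || scale <= 0.0) return dst;
    for (std::size_t i = 0; i < n; ++i) {
        // Map output index i to a fractional position in the source array.
        const double pos = (n == 1)
            ? 0.0
            : static_cast<double>(i) * (src.size() - 1) / (n - 1);
        const std::size_t i0 = static_cast<std::size_t>(pos);
        const std::size_t i1 = (i0 + 1 < src.size()) ? i0 + 1 : i0;
        const double t = pos - static_cast<double>(i0);
        dst[i] = ((1.0 - t) * src[i0] + t * src[i1]) / scale;
    }
    return dst;
}
```

With every feature array resampled to the same length N, a scanned glyph and a font glyph can be compared element by element even if they were extracted at different resolutions.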
Here is an example of the features I use for OCR.
In this case the feature size is scaled to NxN, so each character has 6 arrays of N numbers, such as:
```
int row_pixels[N]; // 1st image
int lin_pixels[N]; // 2nd image
int row_y0[N];     // 3rd image, green
int row_y1[N];     // 3rd image, red
int lin_x0[N];     // 4th image, green
int lin_x1[N];     // 4th image, red
```
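As a rough sketch, per-row and per-column features of this kind could be computed from a binary matrix as follows. The `Glyph` alias, the `Features` struct, and its field names are hypothetical; how they map onto the six arrays above is an assumption, since the answer does not spell out the exact definitions. The matrix is assumed to be rectangular and is not yet rescaled to N.

```cpp
#include <cstddef>
#include <vector>

// glyph[y][x]: 1 = black (occupied), 0 = white. Assumed rectangular.
using Glyph = std::vector<std::vector<int>>;

struct Features {
    std::vector<int> row_count; // occupied pixels per row
    std::vector<int> col_count; // occupied pixels per column
    std::vector<int> row_first; // first occupied x in each row, -1 if none
    std::vector<int> row_last;  // last occupied x in each row, -1 if none
    std::vector<int> col_first; // first occupied y in each column, -1 if none
    std::vector<int> col_last;  // last occupied y in each column, -1 if none
};

Features extract_features(const Glyph& g)
{
    const std::size_t h = g.size();
    const std::size_t w = (h > 0) ? g[0].size() : 0;
    Features f;
    f.row_count.assign(h, 0);
    f.row_first.assign(h, -1);
    f.row_last.assign(h, -1);
    f.col_count.assign(w, 0);
    f.col_first.assign(w, -1);
    f.col_last.assign(w, -1);
    for (std::size_t y = 0; y < h; ++y) {
        for (std::size_t x = 0; x < w; ++x) {
            if (g[y][x] == 0) continue; // skip white pixels
            const int xi = static_cast<int>(x);
            const int yi = static_cast<int>(y);
            ++f.row_count[y];
            ++f.col_count[x];
            if (f.row_first[y] < 0) f.row_first[y] = xi;
            f.row_last[y] = xi;  // x increases, so this ends at the last one
            if (f.col_first[x] < 0) f.col_first[x] = yi;
            f.col_last[x] = yi;  // y increases, so this ends at the last one
        }
    }
    return f;
}
```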
Now pre-compute all the features for each character in your font and for each character you read. Then find the closest match from the font:
- minimum distance between all the feature vectors / arrays
- not exceeding some threshold difference
This is partially invariant to rotation and skew, up to a point. I do OCR for filled characters, so for a different font style it may need some tweaking.
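The matching step could look roughly like this. It is a minimal sketch under stated assumptions: all feature arrays of a glyph are concatenated into one vector of doubles, the distance is squared Euclidean, and `FontGlyph`, `squared_distance`, `classify`, and the `'\0'` "no match" return value are hypothetical names and conventions, not part of the answer.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// One font character with its pre-computed, concatenated feature vector.
struct FontGlyph {
    char label;
    std::vector<double> features;
};

// Squared Euclidean distance; infinity if the vectors differ in length.
double squared_distance(const std::vector<double>& a,
                        const std::vector<double>& b)
{
    if (a.size() != b.size())
        return std::numeric_limits<double>::infinity();
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Return the label of the closest font glyph, or '\0' when even the best
// candidate is farther away than max_sq_distance (the rejection threshold).
char classify(const std::vector<double>& sample,
              const std::vector<FontGlyph>& font,
              double max_sq_distance)
{
    char best_label = '\0';
    double best = std::numeric_limits<double>::infinity();
    for (const FontGlyph& g : font) {
        const double d = squared_distance(sample, g.features);
        if (d < best) {
            best = d;
            best_label = g.label;
        }
    }
    return (best <= max_sq_distance) ? best_label : '\0';
}
```

The threshold has to be tuned on real scans: too small and noisy characters are rejected, too large and similar glyphs get confused.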

[Note]

For the comparison, you can use a distance or a correlation coefficient.
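As one possible reading of "correlation coefficient", here is a sketch of the Pearson correlation between two feature arrays of equal length. The function name `correlation` and the choice to return 0 for mismatched, empty, or constant inputs are assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation coefficient of two equally long arrays.
// Result is in [-1, 1]; values near 1 mean the shapes are very similar.
// Returns 0 when the inputs are empty, differ in length, or are constant.
double correlation(const std::vector<double>& a, const std::vector<double>& b)
{
    if (a.size() != b.size() || a.empty()) return 0.0;
    const double n = static_cast<double>(a.size());
    double mean_a = 0.0, mean_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        mean_a += a[i];
        mean_b += b[i];
    }
    mean_a /= n;
    mean_b /= n;
    double cov = 0.0, var_a = 0.0, var_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double da = a[i] - mean_a;
        const double db = b[i] - mean_b;
        cov += da * db;
        var_a += da * da;
        var_b += db * db;
    }
    if (var_a == 0.0 || var_b == 0.0) return 0.0;
    return cov / std::sqrt(var_a * var_b);
}
```

Unlike a plain distance, the correlation ignores a constant offset and a uniform scale factor between the two arrays, which can help when the scanned glyph is slightly bolder or thinner than the font glyph.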
