Character reconstruction and gap filling for OCR

I am working on text recognition on tires. Before running OCR, I need a clean binary map of the characters.

After preprocessing, the characters come out with broken, discontinuous edges. I tried standard erosion / dilation with disc and line structuring elements in MATLAB, but that does not really help.

Pr1: Any ideas on how to restore these characters and fill the gaps between the character strokes?

(images: original high resolution, original low resolution, Canny edge detection result)

Pr2: The images above are high resolution and taken in good light. If the lighting is poor and the resolution is relatively low, as in the image below, what processing options are still viable?

(image: low-resolution, poorly lit example)

Solutions:

S1: This is the result of applying a median filter to the processed image shared by Spektre. To remove noise I applied a 5x5 median filter and then dilated the image with a line structuring element (5, 11). Even so, OCR (MATLAB 2014b) recognizes only some of the characters.
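For reference, a rough C++ equivalent of this S1 pipeline using OpenCV (OpenCV and the file names are my substitution for the MATLAB calls, and since the line-element parameters above are ambiguous, the 11x5 rectangle is only an assumption):

#include <opencv2/opencv.hpp>

int main()
{
    // "binary_input.png" is a placeholder name for the already thresholded tire image
    cv::Mat bin = cv::imread("binary_input.png", cv::IMREAD_GRAYSCALE);
    if (bin.empty()) return 1;

    cv::Mat denoised;
    cv::medianBlur(bin, denoised, 5);             // 5x5 median filter to remove speckle noise

    // line-like structuring element, wider than tall, to bridge gaps between stroke fragments
    // (11x5 is one possible reading of the (5, 11) line element mentioned above)
    cv::Mat line = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(11, 5));
    cv::Mat dilated;
    cv::dilate(denoised, dilated, line);

    cv::imwrite("ocr_ready.png", dilated);        // this is what would be handed to OCR
    return 0;
}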

In any case, many thanks for the suggestions. I will wait a bit longer in case someone can offer something else, perhaps thinking out of the box :).

(image: S1 result after median filter + dilation)

Results of the MATLAB reimplementation of Spektre's code below (without the stroke redrawing step), with the corners for normalization taken in the order 1,2,3,4:

(result image: corner order 1,2,3,4)

and with thresholds tr0 = 400 and tr1 = 180 and the corner order for normalization 1,3,2,4:

(result image: corner order 1,3,2,4)

Best wishes

Wajahat

2 answers

I played around a little with your input.

Lighting normalization + dynamic-range normalization improves the results slightly, but they are still far from what is needed. When I have time (not sure when, maybe tomorrow), I would like to try working with the partial derivatives to separate the letters from the background and remove the small strokes before integrating back and repainting the mask image. I will edit this answer (and comment / notify you).

normalized lighting

compute the average intensity in the corners and bilinearly rescale the intensities to match the average color

normalized lighting
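For illustration, roughly the same corner-based normalization with OpenCV types (a sketch only; the full picture-class implementation is ui_normalize() in [edit2] below, and the OpenCV usage and function name here are my own):

#include <opencv2/opencv.hpp>

cv::Mat normalizeLighting(const cv::Mat &gray, int sz = 16)
{
    CV_Assert(gray.type() == CV_8UC1 && gray.cols > sz && gray.rows > sz);
    // average intensity of the four sz x sz corner patches (mirrors c00..c11 below)
    double c00 = cv::mean(gray(cv::Rect(0, 0, sz, sz)))[0];
    double c01 = cv::mean(gray(cv::Rect(gray.cols - sz, 0, sz, sz)))[0];
    double c10 = cv::mean(gray(cv::Rect(0, gray.rows - sz, sz, sz)))[0];
    double c11 = cv::mean(gray(cv::Rect(gray.cols - sz, gray.rows - sz, sz, sz)))[0];
    double cavg = (c00 + c01 + c10 + c11) / 4.0;

    cv::Mat out(gray.size(), CV_8UC1);
    for (int y = 0; y < gray.rows; y++)
        for (int x = 0; x < gray.cols; x++)
        {
            // bilinear interpolation of the corner averages = local illumination estimate
            double c0 = c00 + (c01 - c00) * x / gray.cols;
            double c1 = c10 + (c11 - c10) * x / gray.cols;
            double c  = c0  + (c1  - c0 ) * y / gray.rows;
            // rescale so the local estimate maps to the global average
            double v  = (c > 0.0) ? gray.at<uchar>(y, x) * cavg / c : gray.at<uchar>(y, x);
            out.at<uchar>(y, x) = cv::saturate_cast<uchar>(v);
        }
    return out;
}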

edge detection

partial derivatives of the intensity i along x and y ...

  • i = |di(x,y)/dx| + |di(x,y)/dy|

and then thresholded with threshold = 13

edge detect
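A minimal sketch of this edge-detection step with OpenCV types (the function name is mine; note the answer works with intensities summed over RGB, i.e. a 0..765 range, so on plain 8-bit grayscale the threshold of 13 would need to be scaled down accordingly):

#include <opencv2/opencv.hpp>
#include <cstdlib>

cv::Mat edgeDetect(const cv::Mat &gray, int threshold = 13)
{
    CV_Assert(gray.type() == CV_8UC1);
    cv::Mat edges = cv::Mat::zeros(gray.size(), CV_8UC1);
    for (int y = 0; y < gray.rows - 1; y++)
        for (int x = 0; x < gray.cols - 1; x++)
        {
            int dx = gray.at<uchar>(y, x + 1) - gray.at<uchar>(y, x);   // di/dx (forward difference)
            int dy = gray.at<uchar>(y + 1, x) - gray.at<uchar>(y, x);   // di/dy (forward difference)
            if (std::abs(dx) + std::abs(dy) >= threshold)               // |di/dx| + |di/dy|
                edges.at<uchar>(y, x) = 255;                            // white edge on black
        }
    return edges;
}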

[notes]

To eliminate most of the noise, I applied smoothing before the edge detection.

[edit1] After some analysis I found that the edges in your image are too poor for the derivative/integration approach.

Here is an example plot of the intensity after the first derivative along x, taken on the middle line of the image:

poor edges

As you can see, the black areas are good, but the white ones are almost indistinguishable from the background noise. So your only hope is to use min/max filtering as recommended in @Daniel's answer, with more weight on the black edge areas (the white ones are not reliable).

min max

The min/max filter emphasizes the black (blue mask) and white (red mask) areas. If both mask areas were reliable, you would simply fill the space between them, but that is not an option in your case. Instead I would grow the areas (with much more weight on the blue mask) and then OCR the result, using an OCR engine configured for this 3-color input.

You could also take 2 images with different light positions and a fixed camera, and combine them so that recognizable black edges cover the strokes from all sides.
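The combination rule itself is not prescribed here; one simple option, assuming the camera is fixed so the two shots are already registered, is a per-pixel minimum so that a stroke edge that is dark under either light direction survives in the combined image (OpenCV usage and function name are my own):

#include <opencv2/opencv.hpp>

cv::Mat combineExposures(const cv::Mat &lightA, const cv::Mat &lightB)
{
    CV_Assert(lightA.size() == lightB.size() && lightA.type() == lightB.type());
    cv::Mat combined;
    // per-pixel minimum of the two registered shots (one possible combination rule, not the only one)
    cv::min(lightA, lightB, combined);
    return combined;
}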

[edit2] C++ source code for the last method

//---------------------------------------------------------------------------
typedef union
    {
    int dd;
    short int dw[2];
    byte db[4];
    } color;

picture pic0,pic1,pic2;     // pic0 source image, pic1 normalized + min/max, pic2 enlarged (recolored) output
//---------------------------------------------------------------------------
void filter()
    {
    int sz=16;      // [pixels] square size for corner avg color computation (c00..c11)
    int fs0=5;      // blue [pixels] font thickness
    int fs1=2;      // red  [pixels] font thickness
    int tr0=320;    // blue min threshold
    int tr1=125;    // red  max threshold
    int x,y,c,cavg,cmin,cmax;

    pic1=pic0;                              // copy source image
    pic1.rgb2i();                           // convert to grayscale intensity
    for (x=0;x<5;x++) pic1.ui_smooth();     // smooth several times to suppress noise
    cavg=pic1.ui_normalize();               // normalize lighting + dynamic range

    // min max filter: find global min/max intensity
    cmin=pic1.p[0][0].dd; cmax=cmin;
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
        if (cmin>c) cmin=c;
        if (cmax<c) cmax=c;
        }

    // threshold against min/max: near-max -> red, near-min -> blue, the rest -> black
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
             if (cmax-c<tr1) c=0x00FF0000;  // red
        else if (c-cmin<tr0) c=0x000000FF;  // blue
        else                 c=0x00000000;  // black
        pic1.p[y][x].dd=c;
        }
    pic1.rgb_smooth();                      // remove single dots

    // recolor image: repaint each masked pixel as a filled circle of the font thickness
    pic2=pic1; pic2.clear(0);
    pic2.bmp->Canvas->Pen->Color=clWhite;
    pic2.bmp->Canvas->Brush->Color=clWhite;
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
        if (c==0x00FF0000)
            {
            pic2.bmp->Canvas->Pen->Color=clRed;
            pic2.bmp->Canvas->Brush->Color=clRed;
            pic2.bmp->Canvas->Ellipse(x-fs1,y-fs1,x+fs1,y+fs1);     // red
            }
        if (c==0x000000FF)
            {
            pic2.bmp->Canvas->Pen->Color=clBlue;
            pic2.bmp->Canvas->Brush->Color=clBlue;
            pic2.bmp->Canvas->Ellipse(x-fs0,y-fs0,x+fs0,y+fs0);     // blue
            }
        }
    }
//---------------------------------------------------------------------------
int picture::ui_normalize(int sz=32)
    {
    if (xs<sz) return 0;
    if (ys<sz) return 0;
    int x,y,c,c0,c1,c00,c01,c10,c11,cavg;

    // compute average intensity in the four sz*sz corner squares
    for (c00=0,y=    0;y<sz;y++) for (x=    0;x<sz;x++) c00+=p[y][x].dd; c00/=sz*sz;
    for (c01=0,y=    0;y<sz;y++) for (x=xs-sz;x<xs;x++) c01+=p[y][x].dd; c01/=sz*sz;
    for (c10=0,y=ys-sz;y<ys;y++) for (x=    0;x<sz;x++) c10+=p[y][x].dd; c10/=sz*sz;
    for (c11=0,y=ys-sz;y<ys;y++) for (x=xs-sz;x<xs;x++) c11+=p[y][x].dd; c11/=sz*sz;
    cavg=(c00+c01+c10+c11)/4;

    // normalize lighting conditions
    for (y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        {
        // local avg color = bilinear interpolation of the corner colors
        c0=c00+(((c01-c00)*x)/xs);
        c1=c10+(((c11-c10)*x)/xs);
        c =c0 +(((c1 -c0 )*y)/ys);
        // scale to the global avg color
        if (c) p[y][x].dd=(p[y][x].dd*cavg)/c;
        }

    // compute min and max intensities
    for (c0=0,c1=0,y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        {
        c=p[y][x].dd;
        if (c0>c) c0=c;
        if (c1<c) c1=c;
        }

    // maximize dynamic range to <0,765>
    for (y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        p[y][x].dd=((p[y][x].dd-c0)*765)/(c1-c0);   // write the stretched value back

    return cavg;
    }
//---------------------------------------------------------------------------
void picture::rgb_smooth()
    {
    color *q0,*q1;
    int x,y,i;
    color c0,c1,c2;
    if ((xs<2)||(ys<2)) return;
    for (y=0;y<ys-1;y++)
        {
        q0=p[y  ];
        q1=p[y+1];
        for (x=0;x<xs-1;x++)
            {
            c0=q0[x]; c1=q0[x+1]; c2=q1[x];
            // weighted average: current pixel (counted twice) with its right and bottom neighbors
            for (i=0;i<4;i++)
                q0[x].db[i]=WORD((WORD(c0.db[i])+WORD(c0.db[i])+WORD(c1.db[i])+WORD(c2.db[i]))>>2);
            }
        }
    }
//---------------------------------------------------------------------------

I use my own image class, so here are the relevant members:

  • xs,ys - image size in pixels
  • p[y][x].dd - pixel at position (x,y) as a 32-bit integer type
  • clear(color) - clears the entire image
  • resize(xs,ys) - resize the image to a new resolution
  • bmp - VCL encapsulated GDI Bitmap with canvas access

I added the source for only two relevant member functions (no need to copy the whole class here)

[edit3] LQ Image

The best settings I found (the code stays the same):

int sz=32;      // [pixels] square size for corner avg color computation (c00..c11)
int fs0=2;      // blue [pixels] font thickness
int fs1=2;      // red  [pixels] font thickness
int tr0=52;     // blue min threshold
int tr1=0;      // red  max threshold

LQ example

Due to the lighting conditions the red areas are unusable, so they are effectively turned off (tr1 = 0).


You can first apply a max-filter (assign each pixel in the new image the maximum value from a neighborhood around the same pixel in the original image), then a min-filter (assign each pixel the minimum from that neighborhood in the max-image). Especially if you make the neighborhood a little wider than it is tall (say, 2 or 3 pixels left/right, 1 pixel top/bottom), you can recover some of your characters (your image mostly shows gaps in the horizontal direction).
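In OpenCV terms this max-then-min filtering is a grayscale dilation followed by an erosion (a morphological closing) with a rectangular neighborhood wider than it is tall; a sketch, with the 7x3 kernel size as an assumption within the range suggested above:

#include <opencv2/opencv.hpp>

cv::Mat closeGaps(const cv::Mat &gray)
{
    // 7x3 rectangle: 3 pixels left/right, 1 pixel top/bottom around the center (tune for your data)
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(7, 3));
    cv::Mat maxImg, result;
    cv::dilate(gray, maxImg, kernel);   // max-filter
    cv::erode(maxImg, result, kernel);  // min-filter applied to the max-image
    return result;
}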

The optimal size and shape of the neighborhood depend on your specific problem, so you will have to experiment. This operation may also merge adjacent characters - you may have to detect such blobs and split them if they are too wide compared to the other blobs.
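One way to spot merged characters (a sketch; the connected-components approach and the 2x-median-width heuristic are my own choices, not part of the answer):

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>

// Flag connected components that are suspiciously wide compared with the median blob width,
// as candidates for merged characters that may need to be split.
std::vector<cv::Rect> findMergedBlobs(const cv::Mat &binary)
{
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(binary, labels, stats, centroids);

    std::vector<int> widths;
    for (int i = 1; i < n; i++)                          // label 0 is the background
        widths.push_back(stats.at<int>(i, cv::CC_STAT_WIDTH));
    if (widths.empty()) return {};
    std::nth_element(widths.begin(), widths.begin() + widths.size() / 2, widths.end());
    int medianW = widths[widths.size() / 2];

    std::vector<cv::Rect> suspects;
    for (int i = 1; i < n; i++)
    {
        cv::Rect box(stats.at<int>(i, cv::CC_STAT_LEFT),  stats.at<int>(i, cv::CC_STAT_TOP),
                     stats.at<int>(i, cv::CC_STAT_WIDTH), stats.at<int>(i, cv::CC_STAT_HEIGHT));
        if (box.width > 2 * medianW)                     // arbitrary heuristic: 2x the median width
            suspects.push_back(box);
    }
    return suspects;
}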

edit: In addition, the binarization settings are absolutely key. Try several different binarization algorithms (Otsu, Sauvola, ...) to see which one (and which parameters) works best for you.
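A quick way to compare binarization settings (Otsu is available directly in OpenCV; Sauvola is not in the core library, so the locally adaptive mean threshold below stands in for that family of methods):

#include <opencv2/opencv.hpp>

void binarize(const cv::Mat &gray, cv::Mat &otsu, cv::Mat &adaptive)
{
    // global Otsu threshold: picks the threshold automatically from the histogram
    cv::threshold(gray, otsu, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

    // locally adaptive threshold: each pixel compared to the mean of its neighborhood
    // (block size 31 and offset 5 are starting values to tune, not recommendations from the answer)
    cv::adaptiveThreshold(gray, adaptive, 255, cv::ADAPTIVE_THRESH_MEAN_C,
                          cv::THRESH_BINARY, 31, 5);
}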

