Isolating individual digits from an image for OCR

There are methods for simple character recognition of single digits, but they assume the image contains only one digit.

When an image contains several digits, we cannot use the same algorithm, since the entire bitmap is different. How do we split the image so that we can “modularize” the OCR operation and run it on each individual digit?

2 answers

What you want to accomplish is an image segmentation problem, not a digit classification problem, just as @VitaliPro said. Both are OCR problems, but (hugely simplified) classification asks "what kind of character is this", while segmentation asks "how many characters do I have here". You already know how to solve the classification problem, so let's see how the segmentation problem is usually solved.

You want to segment the image into characters (known as “regions” in segmentation) and then apply digit classification to each region. One way to do this is watershed segmentation, which uses the gradient of the colors to identify edges and areas.
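Before turning to watershed, the "segment, then classify each region" idea can be illustrated with plain connected-component labelling. Note that this is a simpler technique swapped in for illustration, not watershed. It is a minimal sketch, assuming the image is already thresholded into a boolean array where True marks ink and that the digits do not touch; `split_digits` is a hypothetical helper name.

```python
import numpy as np
from scipy import ndimage


def split_digits(binary):
    """Return one cropped boolean mask per connected blob, left to right.

    Assumption: `binary` is a 2-D boolean array, True where there is ink.
    """
    labels, count = ndimage.label(binary)   # number each connected blob
    boxes = ndimage.find_objects(labels)    # bounding slices, indexed by label - 1
    regions = []
    for index, box in enumerate(boxes, start=1):
        # Keep only this blob's pixels inside its bounding box.
        regions.append((box[1].start, labels[box] == index))
    regions.sort(key=lambda item: item[0])  # reading order: by left edge
    return [mask for _, mask in regions]


# Toy image with two separate blobs standing in for two digits.
page = np.zeros((5, 9), dtype=bool)
page[1:4, 1:3] = True   # first "digit": 3 rows x 2 columns
page[0:5, 5:8] = True   # second "digit": 5 rows x 3 columns

digits = split_digits(page)
print([d.shape for d in digits])
```

Each returned crop could then be handed to the single-digit classifier you already have.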

A simple watershed can be done in Python with numpy / scipy / skimage, for example:

    #!/usr/bin/env python
    from PIL import Image
    import numpy as np
    from scipy import ndimage
    from skimage import morphology as morph
    from skimage.filters import rank
    from skimage.segmentation import watershed


    def big_regions(lb, tot):
        """Return (pixel count, label) pairs, largest region first."""
        l = []
        for i in range(1, tot + 1):
            l.append(((i == lb).sum(), i))
        l.sort()
        l.reverse()
        return l


    def segment(img, outimg):
        # rank filters need a single-channel 8-bit image
        img = np.array(Image.open(img).convert('L'))
        # denoise with a median filter
        den = rank.median(img, morph.disk(3))
        # continuous regions (low gradient)
        markers = rank.gradient(den, morph.disk(5)) < 10
        mrk, tot = ndimage.label(markers)
        grad = rank.gradient(den, morph.disk(2))
        labels = watershed(grad, mrk)
        print('Total regions:', tot)
        regs = big_regions(labels, tot)

Here I use the watershed implementation from skimage. Current releases expose it as skimage.segmentation.watershed; older releases had it in skimage.morphology, and the rank filters in skimage.filter.

Most of the time when using watershed, you would place the region on top of the image as a mask to get the actual content of the region, which I do not do in the code above. However, this is not required for digits or most text, since those are expected to be black and white.
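Placing a region on top of the image as a mask could look roughly like this. It is a minimal sketch in plain numpy, assuming `labels` is an integer label image with the same shape as `img` (as a watershed would return); `region_content` and the `background` fill value are hypothetical names chosen for illustration.

```python
import numpy as np


def region_content(img, labels, region_id, background=255):
    """Keep only the pixels of one region, cropped to its bounding box."""
    mask = labels == region_id
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("region %r not found in labels" % (region_id,))
    window = (slice(rows[0], rows[-1] + 1), slice(cols[0], cols[-1] + 1))
    # Pixels inside the box that belong to other regions become background.
    return np.where(mask[window], img[window], background)


img = np.arange(16, dtype=np.uint8).reshape(4, 4)
labels = np.array([[1, 1, 2, 2],
                   [1, 1, 2, 2],
                   [1, 3, 3, 2],
                   [1, 3, 3, 2]])
print(region_content(img, labels, 3))
print(region_content(img, labels, 1))
```

Filling foreign pixels with a background value rather than cropping alone matters when bounding boxes of neighbouring regions overlap.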

Watershed uses color gradients to identify edges, but the output of a filter such as Canny or Sobel can also be used. Note that I denoise the image (a slight blur) to prevent detection of very small regions, as these are most likely artifacts or noise. Using Canny or Sobel filters may require additional denoising steps, because these filters produce sharp edges.
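As an illustration of the Sobel alternative, the sketch below blurs the image first and then computes a Sobel gradient magnitude with scipy.ndimage. It is a minimal sketch under assumptions: `sobel_gradient` is a hypothetical helper, the Gaussian blur stands in for the denoising step, and `sigma=1.0` is an arbitrary value that would need tuning for real images.

```python
import numpy as np
from scipy import ndimage


def sobel_gradient(img, sigma=1.0):
    """Gradient magnitude of a grey image after Gaussian smoothing."""
    smooth = ndimage.gaussian_filter(img.astype(float), sigma=sigma)
    dx = ndimage.sobel(smooth, axis=1)   # changes along columns
    dy = ndimage.sobel(smooth, axis=0)   # changes along rows
    return np.hypot(dx, dy)


# Dark left half, bright right half: the only edge is between columns 3 and 4.
img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 255

grad = sobel_gradient(img)
print(grad.argmax(axis=1))   # column of the strongest response in each row
```

The resulting array is an ordinary scalar image, so it could in principle be used as the gradient input to a watershed in place of a rank gradient.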

Segmentation is used for much more than character separation; it is often used on images to distinguish important regions (i.e. large areas with a very similar look). For example, if I add some matplotlib code to the above and change the segment function, say:

    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt
    import matplotlib.cm as cm
    from skimage.segmentation import watershed


    def plot_seg(spr, spc, sps, img, cmap, alpha, xlabel):
        plt.subplot(spr, spc, sps)
        plt.imshow(img, cmap=cmap, interpolation='nearest', alpha=alpha)
        plt.yticks([])
        plt.xticks([])
        plt.xlabel(xlabel)


    def plot_mask(spr, spc, sps, reg, lb, regs, cmap, xlabel):
        # hide everything except the requested region
        masked = np.ma.masked_array(lb, ~(lb == regs[reg][1]))
        plot_seg(spr, spc, sps, masked, cmap, 1, xlabel)


    def plot_crop(spr, spc, sps, reg, img, lb, regs, cmap):
        masked = np.ma.masked_array(img, ~(lb == regs[reg][1]))
        # drop rows and columns that are entirely zero
        crop = masked[~np.all(masked == 0, axis=1), :]
        crop = crop[:, ~np.all(crop == 0, axis=0)]
        plot_seg(spr, spc, sps, crop, cmap, 1, '%i px' % regs[reg][0])


    def segment(img, outimg):
        # rank filters need a single-channel 8-bit image
        img = np.array(Image.open(img).convert('L'))
        den = rank.median(img, morph.disk(3))
        # continuous regions (low gradient)
        markers = rank.gradient(den, morph.disk(5)) < 10
        mrk, tot = ndimage.label(markers)
        grad = rank.gradient(den, morph.disk(2))
        labels = watershed(grad, mrk)
        print('Total regions:', tot)
        regs = big_regions(labels, tot)

        spr = 3
        spc = 6
        # cm.spectral from older matplotlib is now called cm.nipy_spectral
        spectral = cm.nipy_spectral
        # first row: the steps of the segmentation
        plot_seg(spr, spc, 1, img, cm.gray, 1, 'image')
        plot_seg(spr, spc, 2, den, cm.gray, 1, 'denoised')
        plot_seg(spr, spc, 3, grad, spectral, 1, 'gradient')
        plot_seg(spr, spc, 4, mrk, spectral, 1, 'markers')
        plot_seg(spr, spc, 5, labels, spectral, 1, 'regions\n%i' % tot)
        plot_seg(spr, spc, 6, img, cm.gray, 1, 'composite')
        plot_seg(spr, spc, 6, labels, spectral, 0.7, 'composite')
        # second row: the six largest regions
        plot_mask(spr, spc, 7, 0, labels, regs, spectral, 'main region')
        plot_mask(spr, spc, 8, 1, labels, regs, spectral, '2nd region')
        plot_mask(spr, spc, 9, 2, labels, regs, spectral, '3rd region')
        plot_mask(spr, spc, 10, 3, labels, regs, spectral, '4th region')
        plot_mask(spr, spc, 11, 4, labels, regs, spectral, '5th region')
        plot_mask(spr, spc, 12, 5, labels, regs, spectral, '6th region')
        # third row: the same regions used as masks on the image
        plot_crop(spr, spc, 13, 0, img, labels, regs, cm.gray)
        plot_crop(spr, spc, 14, 1, img, labels, regs, cm.gray)
        plot_crop(spr, spc, 15, 2, img, labels, regs, cm.gray)
        plot_crop(spr, spc, 16, 3, img, labels, regs, cm.gray)
        plot_crop(spr, spc, 17, 4, img, labels, regs, cm.gray)
        plot_crop(spr, spc, 18, 5, img, labels, regs, cm.gray)
        # the Agg backend cannot open a window, so write the figure to outimg
        plt.savefig(outimg)

(This example does not run by itself; you need to put the previous code example above it.)

I can get a pretty nice segmentation of any image. For example, here is the result of the code above:

[Image: output of the segmentation script]

The first row shows the steps of the segment function, the second row shows the regions, and the third row shows the regions used as masks on top of the image.

(P.S. Yes, the plotting code is pretty ugly, but it is easy to understand and change.)


Follow these steps:

  • Load the image.
  • Select the digits (by finding contours and applying constraints on the area and height of the characters to avoid false positives). This splits the image, and so gives you the modular OCR operation you want to perform.
  • Use a simple k-nearest-neighbor algorithm to identify and classify each digit.
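The filtering and classification steps could be sketched as follows in plain numpy. This is a minimal sketch under assumptions: the candidate boxes are hypothetical (x, y, w, h) tuples of the kind a contour search returns, `filter_boxes` and `knn_predict` are illustrative names, the thresholds are made up, and the training features are toy 2-D points rather than real pixel data.

```python
import numpy as np


def filter_boxes(boxes, min_area, min_height, max_height):
    """Drop candidate boxes whose area or height is implausible for a digit."""
    kept = []
    for x, y, w, h in boxes:
        if w * h >= min_area and min_height <= h <= max_height:
            kept.append((x, y, w, h))
    return kept


def knn_predict(train_x, train_y, sample, k=3):
    """Label `sample` by majority vote among its k nearest training rows."""
    dists = np.linalg.norm(train_x - sample, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]                    # indices of the k closest rows
    votes = np.bincount(train_y[nearest])              # count votes per label
    return int(np.argmax(votes))


# Hypothetical candidates: a speck of noise, a digit-sized box, an oversized blob.
candidates = [(0, 0, 2, 2), (5, 0, 10, 20), (30, 0, 40, 90)]
digit_boxes = filter_boxes(candidates, min_area=50, min_height=10, max_height=40)

# Toy training set: two clusters standing in for two digit classes.
train_x = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
train_y = np.array([0, 0, 0, 1, 1, 1])

print(digit_boxes)
print(knn_predict(train_x, train_y, np.array([0.5, 0.5])))
print(knn_predict(train_x, train_y, np.array([10.2, 10.2])))
```

In a real pipeline each kept box would be cropped from the image, resized to a fixed size, and flattened into the feature vector passed to the classifier.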

Source: https://habr.com/ru/post/1269590/

