Classify various images

Question

I have a number of images from Chinese genealogies, and I would like to be able to classify them programmatically. Generally speaking, one type of image has mostly text line by line, and the other type may be in grid or chart format.

Sample Photos

"Desired" type: http://www.flickr.com/photos/63588871@N05/8138563082/
"Other" type: http://www.flickr.com/photos/63588871@N05/8138561342/in/photostream/

Question: Is there a (relatively) easy way to do this? I have experience with Python, but not much knowledge of image processing. Pointers to other resources are also appreciated.

Thanks!

+4

python image

Devinrb Oct 30 '12 at 15:52

2 answers

AFAIK, there is no easy way to solve this problem. You will need a decent amount of image processing and some basic machine learning to classify these types of images, and even then the result may not be 100% accurate.

Another note:

Although this can be solved using machine learning methods alone, I would advise you to start with image processing and try to convert each image into a form in which the two types differ clearly. To do this, start by reading about the FFT. After that, take a look at some digital image processing techniques. Once you are comfortable with those, you can read about pattern recognition.

This is only one suggested approach, but there are more ways to achieve this.

0

Minion91 Oct 30 '12 at 15:58

Junuxx · Accepted Answer · 2012-10-30T16:51:22+0000

Assuming that at least some of the grid lines are exactly or almost exactly vertical, a fairly simple approach may work.

I used PIL to find all the columns in the image where more than half the pixels were darker than some threshold value.

code

    from PIL import Image, ImageDraw  # Pillow; old PIL used "import Image, ImageDraw"

    withlines = Image.open('withgrid.jpg')
    nolines = Image.open('nogrid.jpg')

    def findlines(image):
        w, h = image.size
        s = w * h
        # convert to grayscale, then threshold: dark pixels become 255, the rest 0
        im = image.convert('L').point(lambda i: 255 * (i < 60))
        d = im.getdata()  # flat pixel sequence; faster than per-pixel operations
        linecolumns = []
        for col in range(w):
            # count the dark pixels in this column
            black = sum(d[x] for x in range(col, s, w)) // 255
            if black > 450:  # enough dark pixels to count as a line
                linecolumns.append(col)
        # return an image showing the detected lines
        im2 = image.convert('RGB')
        draw = ImageDraw.Draw(im2)
        for col in linecolumns:
            draw.line((col, 0, col, h - 1), fill='#f00', width=1)
        return im2

    findlines(withlines).show()
    findlines(nolines).show()

results

The detected vertical lines are shown in red for illustration.

As you can see, four grid lines have been detected. With some processing to ignore the left and right sides and the center of the book, there should be no false positives on images of the desired type.

This means that you can use the code above to detect black columns and discard those that are near the edges or the center. If any black columns remain, classify the image as the "other", unwanted type.
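The filtering step described above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the function names filter_columns and classify, and the margin fractions edge_frac and center_frac, are hypothetical and would need tuning on real scans. It assumes you have the list of detected column indices (the linecolumns list built inside findlines) and the image width.

```python
def filter_columns(line_columns, width, edge_frac=0.05, center_frac=0.10):
    """Drop detected columns near the left/right edges or the book's center.

    edge_frac and center_frac are assumed values, not taken from the answer.
    """
    edge = width * edge_frac
    center_lo = width * (0.5 - center_frac / 2)
    center_hi = width * (0.5 + center_frac / 2)
    kept = []
    for col in line_columns:
        near_edge = col < edge or col >= width - edge
        near_center = center_lo <= col <= center_hi
        if not (near_edge or near_center):
            kept.append(col)
    return kept


def classify(line_columns, width):
    """Return 'other' if any dark column survives filtering, else 'desired'."""
    return 'other' if filter_columns(line_columns, width) else 'desired'
```

To use it with the code above, findlines would have to return linecolumns (or return it alongside the image), and you would then call classify(linecolumns, image.size[0]).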
