This is not an area I am very familiar with, but assuming you cannot use OCR (because your text is illegible or something), I would (probably naively) try something like:
- load image data into memory
- split the pixel data into image lines
- find each "line" that has only white pixels: note them as "white lines"
- for each column in each band of text between white lines, try to find white space (the gaps between words)
- take all your new x, y coordinates and crop the image.
It actually sounded like a fun exercise, so I gave it a go with pyPNG:
    import png
    import sys

    KERNING = 3

    def find_rows(pixels, width, height):
        "find all rows that are purely white"
        white_rows = []
        is_white = False
        for y in range(height):
            if sum(sum(pixels[(y*4*width)+x*4+p] for p in range(3)) for x in range(width)) >= width*3*254:
                if not is_white:
                    white_rows.append(y)
                is_white = True
            else:
                is_white = False
        return white_rows

    def find_words_in_image(blob, tolerance=30):
        n = 0
        r = png.Reader(bytes=blob)
        (width, height, pixels_rows, meta) = r.asRGBA8()
        pixels = []
        for row in pixels_rows:
            for px in row:
                pixels.append(px)
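The code above is cut off before it reaches the column step from the list. Here is a rough sketch of how that step might look, assuming the same flat RGBA layout as `pixels` (4 values per pixel, row by row). The names `is_white_column`, `find_word_spans`, `min_gap` and `threshold` are made up for illustration and are not part of the code above or of pyPNG. `min_gap` is my guess at the kind of role `KERNING` plays: the number of consecutive white columns that counts as a gap between words rather than between letters.

```python
def is_white_column(pixels, width, x, y_start, y_end, threshold=254):
    """True if every pixel in column x, rows y_start..y_end-1, is near-white.

    Assumes pixels is a flat list of RGBA values (4 per pixel, row-major).
    threshold is a hypothetical per-channel average cutoff.
    """
    for y in range(y_start, y_end):
        base = (y * width + x) * 4
        # Compare the sum of R, G, B against the cutoff; alpha is ignored.
        if sum(pixels[base:base + 3]) < 3 * threshold:
            return False
    return True


def find_word_spans(pixels, width, y_start, y_end, min_gap=3):
    """Return (x_start, x_end) spans of non-white columns in one band of text.

    x_end is exclusive. A span is closed only after min_gap consecutive
    white columns, so narrower gaps (between letters) stay inside a span.
    """
    spans = []
    start = None  # x where the current span began, or None if not in a span
    gap = 0       # consecutive white columns seen since the last inked column
    for x in range(width):
        if is_white_column(pixels, width, x, y_start, y_end):
            if start is not None:
                gap += 1
                if gap >= min_gap:
                    spans.append((start, x - gap + 1))
                    start = None
                    gap = 0
        else:
            if start is None:
                start = x
            gap = 0
    if start is not None:
        # Span runs to the edge of the image, minus any trailing white columns.
        spans.append((start, width - gap))
    return spans
```

Each span, combined with the `y_start` and `y_end` of its band, gives the box to crop for one word. You would probably have to tune `min_gap` per font and scan resolution.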