Open-source tools for an OCR / real-time image processing application?

I have an application that I want to build. We have wall schedules that are divided into small rectangles by black lines on a white background. Magnetic name tags are placed in a particular rectangle to indicate that the person should work in that cell. The system works very well for communication between people, but I would like to automate storing the schedule information in a database.

I will set up a camera in a fixed position, focused on the schedule board. Periodically, the camera will take a picture of the board. I want to write code that works out which name tag is in which cell. This will require some form of character or pattern recognition. Each name tag has a large number on it, which I will use to identify the person the tag belongs to.
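
To make the "which tag is in which cell" step concrete, here is a minimal sketch of the lookup, assuming the camera is fixed so the board always occupies the same pixel rectangle. The function name, the board rectangle, the grid size and the detected tag positions are all hypothetical; a real setup would measure them once during calibration and get the positions from the recognition step.

```python
def cell_for_point(x, y, board, rows, cols):
    """Return (row, col) of the grid cell containing pixel (x, y), or None.

    board is (left, top, right, bottom) in pixels. All values used below are
    hypothetical and would come from a one-time calibration of the fixed camera.
    """
    left, top, right, bottom = board
    if not (left <= x < right and top <= y < bottom):
        return None  # the point is outside the board
    cell_w = (right - left) / cols
    cell_h = (bottom - top) / rows
    return int((y - top) // cell_h), int((x - left) // cell_w)


# Hypothetical calibration: the board fills this rectangle, 5 rows x 7 columns.
BOARD = (100, 50, 1500, 1050)

# Hypothetical detections: tag number -> centre pixel of the tag in the photo.
detections = {17: (320, 480), 42: (1499, 51), 8: (20, 20)}

schedule = {tag: cell_for_point(x, y, BOARD, 5, 7)
            for tag, (x, y) in detections.items()}
print(schedule)
```

The resulting dictionary is the kind of record that could then be written to the database.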

Naturally, I turn to Python when I have a new programming problem to solve. I found this post -> python image recognition, which looks like a good place to start (with PIL and numpy).

Do you know a good way to do this?

Update: I tried SimpleCV, and so far it looks good.

3 answers

This is actually a rather complex problem, even though it looks simple. You can make it much easier, however, by controlling how the image is taken so that it is manageable. I have the following suggestions:

  • Point the camera straight at the board and use a reasonable lens, so that there is minimal distortion at the edges of the image and no perspective distortion.
  • Since you will only take an occasional picture for analysis, performance should not be a problem. Shoot high-resolution images with a flash or a long exposure time (everything in the shot is stationary) to get the best image quality.
  • If the number of different tags you expect is not too large, it may be easier to match reference images of the tags against your picture using template matching than to do full OCR. This is much easier to get working if your image is good enough. The Python interface to OpenCV is very complete.
  • High Performance Mark made a good comment on your question about putting barcodes on the tags. I would add QR codes as an option, but the idea is the same. Both are easy to detect, and there are good libraries for reading them.
  • If you decide that you do need OCR, look at the available OCR packages rather than trying to roll your own. Try pytesser for the Tesseract engine, or the Python interface to OCRopus.
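
To illustrate the template-matching suggestion, here is a minimal sketch of the idea in plain numpy: slide a reference image of a tag over the photo and keep the position with the smallest sum of squared differences. This is a brute-force stand-in for illustration only; in practice OpenCV's `cv2.matchTemplate` does the same job far more efficiently. The function name and the synthetic "board" and "tag" arrays are made up for the demo.

```python
import numpy as np


def best_match(image, template):
    """Return ((row, col), score) of the best template position.

    The score is the sum of squared differences (lower is better). The
    brute-force loop is only for illustration and is slow on large images.
    """
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    ih, iw = image.shape
    best_pos, best_score = None, float("inf")
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw]
            score = float(np.sum((window - template) ** 2))
            if score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


# Synthetic demo: a white "board" with one dark tag pattern pasted at (12, 30).
tag = np.array([[0, 255, 0],
                [255, 0, 255],
                [0, 255, 0]], dtype=float)
board = np.full((40, 60), 255.0)
board[12:15, 30:33] = tag

position, score = best_match(board, tag)
print(position, score)
```

With one reference image per tag, the tag whose template gives the lowest score inside a cell would be the best guess for that cell. Real photos will never give a perfect score, so some threshold for "no tag here" would have to be chosen by experiment.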

Since you mentioned that you want to use Python for this problem, you could take a look at SimpleCV. It gives you an easy way to capture an image from a camera and perform basic image processing.


I totally agree with jilles de witt that OCR would be an extremely difficult image analysis task to develop from scratch. Reading a barcode would be a better option, but it would also be hard to program yourself and would require fairly sophisticated image processing, as others have noted. For this application, however, you do not really need OCR or formal barcodes, QR codes, or other 2D codes.

Since your application only has to distinguish a limited number of tags, you could design your own simple code. For example, you could place from 0 to 4 large dots in a 2x2 array after each person's name. This simple code distinguishes 16 unique tags, and it would be much easier to image, locate, and decode than a formal code. Add a locator line if the position of the code is not consistent.
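
As a rough sketch of how such a 2x2 dot code could be decoded, assume the code area has already been located and cropped to a small grayscale patch (0 = black, 255 = white). Each quadrant of the patch is treated as one bit: if it is dark on average, a dot is present. The function name, the threshold of 128 and the bit order are arbitrary choices made for this example.

```python
def decode_dot_code(patch, threshold=128):
    """Decode a 2x2 dot code from a grayscale patch (list of rows, 0 = black).

    Each quadrant is one bit: a dark average means a dot is present.
    The bit order (top-left, top-right, bottom-left, bottom-right) is an
    arbitrary choice for this sketch.
    """
    h, w = len(patch), len(patch[0])
    value = 0
    for qr in range(2):
        for qc in range(2):
            rows = range(qr * h // 2, (qr + 1) * h // 2)
            cols = range(qc * w // 2, (qc + 1) * w // 2)
            pixels = [patch[r][c] for r in rows for c in cols]
            dot = sum(pixels) / len(pixels) < threshold
            value = (value << 1) | int(dot)
    return value


W, B = 255, 0
# Dots in the top-left and bottom-right quadrants: bits 1, 0, 0, 1.
patch = [
    [B, B, W, W],
    [B, B, W, W],
    [W, W, B, B],
    [W, W, B, B],
]
tag_id = decode_dot_code(patch)
print(tag_id)
```

Four quadrants give values 0 through 15, which matches the 16 tags mentioned above. Note that an empty code (no dots) decodes to 0, so it may be worth reserving that pattern rather than assigning it to a person.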


Source: https://habr.com/ru/post/908720/

