Scan and read a document with checkboxes

I have a request from a client who provides meals to elderly people in several places. To order, each person fills out a weekly form and ticks checkboxes for their choice on each day (the form also covers special dietary requirements).

For instance:

Name
Commune
With salt ( ) Without salt ( )
Mon : Meal 1 ( ) Meal 2 ( ) Dessert 1 ( ) Dessert ( )
Tues : Meal 1 ( ) Meal 2 ( ) Dessert 1 ( ) Dessert ( )

The data from all the sheets should then be compiled to tell us how much of each type of food to prepare each day for each commune ...
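As an illustration of that compilation step, here is a minimal sketch. It assumes each scanned form has already been turned into a record; the record layout (`commune`, `salt`, `choices`) and the function name `compile_totals` are hypothetical and only serve the example.

```python
from collections import Counter

# Hypothetical records, one per scanned form. The field names are
# assumptions made for this sketch, not the output of any real tool.
forms = [
    {"commune": "A", "salt": False,
     "choices": {"Mon": ["Meal 1", "Dessert 1"], "Tues": ["Meal 2"]}},
    {"commune": "A", "salt": False,
     "choices": {"Mon": ["Meal 1"], "Tues": ["Meal 1"]}},
    {"commune": "B", "salt": True,
     "choices": {"Mon": ["Meal 2", "Dessert 1"]}},
]

def compile_totals(forms):
    """Count portions per (commune, day, item, with_salt)."""
    totals = Counter()
    for form in forms:
        for day, items in form["choices"].items():
            for item in items:
                totals[(form["commune"], day, item, form["salt"])] += 1
    return totals

totals = compile_totals(forms)
for key in sorted(totals):
    print(key, totals[key])
```

Keying the counter on a tuple keeps the salt requirement separate from the meal choice, so the kitchen can see both at once.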

The sheets are all identical, so I hope they can be scanned and read automatically.

I do not know of any software that does this. What is the best way to solve this problem? I'm looking at Tesseract at the moment, but maybe there is a simpler technique?

EDIT: we are talking about a few hundred forms per week. Ideally, we would scan them all in one batch, extract the data, and store the forms electronically.

1 answer

You are not looking for OCR, which reads machine-printed text. You are looking for ICR / OMR software, which is also known as forms processing or data capture. OMR stands for Optical Mark Recognition, which is what you are trying to do: recognize the state of the checkboxes.

Further information on handwriting recognition can be found here: ICR for typed machine text?

Since your forms are all the same, they fall into the category of "fixed forms," and a template-based software package can process them. Here is a short document explaining the differences between the types of forms: www.wisetrend.com/files/Structured_vs_Semi-Structured.pdf

Your blank form must also be properly designed for machine recognition. It should have registration marks for better template alignment, a clear flow so that users know how to fill it in naturally, checkboxes of an appropriate size, etc.

I believe that FlexiCapture will do everything you need. There are at least a few other solutions that can perform a similar process. I work as an integrator / consultant on paper form processing projects.

I removed the "mobile" tag because I believe that you are not going to use a cell phone to capture these images. If you are, I would advise against it if you have other options. You mentioned scanning them on a regular scanner, which is the best option for achieving good image quality. Believe me, you will have enough to deal with in processing hand-filled forms, so optimize your forms, scanning, software and process as much as possible.

If you are interested in developing this yourself, it is possible. The process is to compare each image area (each checkbox) against a "baseline" taken from the blank form to see whether there is additional ink in that area. If the difference is above a certain threshold, the checkbox has been marked. Typical problems are the alignment of the areas and borderline threshold cases (faint / light marks). Commercial packages handle this automatically.
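The comparison described above can be sketched as follows. This is a minimal illustration, assuming the scan is already aligned to the blank template and loaded as a grayscale NumPy array; the helper names (`fill_ratio`, `is_checked`), the box coordinates and the threshold values are all hypothetical, and a synthetic page stands in for a real scan.

```python
import numpy as np

def fill_ratio(page, box, dark_below=128):
    """Fraction of dark pixels inside box = (top, left, height, width)."""
    top, left, h, w = box
    region = page[top:top + h, left:left + w]
    return float(np.mean(region < dark_below))

def is_checked(filled_page, blank_page, box, min_extra_ink=0.05):
    """True if the filled form has noticeably more ink than the blank one."""
    extra = fill_ratio(filled_page, box) - fill_ratio(blank_page, box)
    return extra > min_extra_ink

# Hypothetical template: checkbox positions on the (aligned) page.
boxes = {"mon_meal_1": (10, 10, 20, 20), "mon_meal_2": (10, 50, 20, 20)}

# Synthetic blank form: white page with printed box outlines.
blank = np.full((100, 200), 255, dtype=np.uint8)
for top, left, h, w in boxes.values():
    blank[top, left:left + w] = 0
    blank[top + h - 1, left:left + w] = 0
    blank[top:top + h, left] = 0
    blank[top:top + h, left + w - 1] = 0

# Synthetic filled form: a two-pixel-wide X drawn in the first box only.
filled = blank.copy()
top, left, h, w = boxes["mon_meal_1"]
for i in range(2, 18):
    filled[top + i, left + i] = 0
    filled[top + i, left + i + 1] = 0
    filled[top + i, left + w - 1 - i] = 0
    filled[top + i, left + w - 2 - i] = 0

results = {name: is_checked(filled, blank, box) for name, box in boxes.items()}
print(results)
```

Subtracting the blank form's ratio removes the printed outline from the measurement, so only added ink counts. On real scans the two hard parts mentioned above remain: the page has to be aligned to the template first, and `min_extra_ink` has to be tuned against faint marks.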

Please let me know if you require further guidance.

ilya evdokimov


Source: https://habr.com/ru/post/1480877/

