Using HMM for offline character recognition

Question

I have extracted features from many images of isolated characters (such as gradient, neighboring pixel weight, and geometric properties). How can I use an HMM as a classifier trained on this data? All the literature I have read about HMMs refers to states and state transitions, but I can't connect that to features and class labels. The example on the JAHMM home page does not relate to my problem. I need to use an HMM not because it will work better than other approaches to this problem, but because of constraints on the project topic.

There was an answer to this question for online recognition, but I want the same for offline recognition, and in a little more detail.

EDIT: I split each character into a grid with a fixed number of squares. Now I plan to perform feature extraction on each block of the grid and, thus, get a sequence of features for each sample, moving from left to right and from top to bottom.

Will this constitute an adequate "sequence" for an HMM, i.e. can the HMM model the temporal variation of the data even though the character is not drawn from left to right and from top to bottom? If not, please suggest an alternative.
Should I feed in a lot of features or start with a few? How do I know whether the HMM is underperforming or the features are bad? I am using JAHMM.
Extracting stroke features is difficult, and can they even be logically combined with the grid features (since the HMM expects a sequence generated by some random process)?

ocr classification hidden-markov-models

Bug killer Nov 02 '13 at 22:27

1 answer

Throwback1986 · Accepted Answer · 2013-11-14T23:15:38+0000

I have usually seen neural networks used for this sort of recognition task, e.g. here, here, here, and here. Since a simple Google search turns up so many hits for neural networks in OCR, I assume you are set on using an HMM (a project constraint, right?). Regardless, these links may offer some insight into gridding the image and extracting image features.

Your approach of turning the grid into a sequence of observations is reasonable. In this case, make sure that you do not confuse observations and states. The features that you extract from one block should be collected into one observation, i.e. a feature vector. (Compared to speech recognition, your block's feature vector is analogous to the feature vector associated with a speech phoneme.) You don't really have much information about the underlying states. This is the hidden aspect of an HMM, and the training process should inform the model how likely one feature vector is to follow another for a given character (i.e., the transition probabilities).
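To connect this to class labels, one common arrangement (my assumption here, the setup is not spelled out above) is to train one HMM per character class and assign a new sample to the class whose model gives its observation sequence the highest likelihood. Below is a minimal sketch of the scoring step using the forward algorithm in log space. It is plain Python, not JAHMM code; the dictionary layout, the diagonal Gaussian emissions, and the hand-set toy models are all hypothetical and stand in for models you would train with Baum-Welch.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    # log(0) is treated as -inf so zero-probability transitions are allowed
    return math.log(p) if p > 0 else NEG_INF

def log_sum_exp(values):
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_gaussian(x, mean, var):
    # Log density of a diagonal-covariance Gaussian at feature vector x
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
               for xi, mu, v in zip(x, mean, var))

def log_likelihood(hmm, observations):
    # Forward algorithm: log P(observations | hmm)
    n = len(hmm["start"])
    alpha = [safe_log(hmm["start"][s])
             + log_gaussian(observations[0], *hmm["emit"][s])
             for s in range(n)]
    for x in observations[1:]:
        alpha = [log_sum_exp([alpha[p] + safe_log(hmm["trans"][p][s])
                              for p in range(n)])
                 + log_gaussian(x, *hmm["emit"][s])
                 for s in range(n)]
    return log_sum_exp(alpha)

def classify(models, observations):
    # models maps a class label to that class's HMM
    return max(models, key=lambda label: log_likelihood(models[label], observations))

def toy_model(means):
    # Hypothetical 2-state left-to-right model with 1-D features
    return {"start": [1.0, 0.0],
            "trans": [[0.5, 0.5], [0.0, 1.0]],
            "emit": [([m], [0.05]) for m in means]}

models = {"A": toy_model([0.1, 0.9]), "B": toy_model([0.9, 0.1])}
label = classify(models, [[0.1], [0.15], [0.85], [0.9]])
```

The states never get labels of their own here. Only the per-class models do, which is why the features go into the observations and the class label goes on the model.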

Since this is an offline process, do not worry about the temporal aspects of how the characters are actually drawn. For the purposes of your task, you have imposed a temporal order on the sequence of observations by scanning the blocks from left to right and from top to bottom. This should work fine.
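As a rough illustration of that imposed ordering, the sketch below walks a binary character image block by block, left to right and top to bottom, and emits one feature vector per block. The three features (ink density and the normalized ink centroid) are hypothetical placeholders for your own gradient and geometric features, and the function assumes the image is at least `rows` by `cols` pixels.

```python
def grid_observations(image, rows, cols):
    """Turn a binary image (list of equal-length rows of 0/1 pixels)
    into a sequence of per-block feature vectors in row-major order."""
    height, width = len(image), len(image[0])
    sequence = []
    for r in range(rows):                      # top to bottom
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):                  # left to right
            left, right = c * width // cols, (c + 1) * width // cols
            ink = [(y, x) for y in range(top, bottom)
                   for x in range(left, right) if image[y][x]]
            area = (bottom - top) * (right - left)
            density = len(ink) / area
            if ink:
                # Centroid of the ink, scaled to [0, 1] within the block
                cy = (sum(y for y, _ in ink) / len(ink) + 0.5 - top) / (bottom - top)
                cx = (sum(x for _, x in ink) / len(ink) + 0.5 - left) / (right - left)
            else:
                cy = cx = 0.5                  # neutral value for empty blocks
            sequence.append([density, cy, cx])
    return sequence

image = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 1]]
observations = grid_observations(image, 2, 2)
```

Each inner list is one observation, so an 8x8 grid always gives a sequence of 64 observations, whatever the character.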

As for HMM performance: choose a reasonable vector of salient features. In speech recognition, the dimensionality of a feature vector can be high (> 10). (The cited literature may also help here.) Set aside a percentage of the training data so that you can test the model properly. First train the model, and then evaluate it on the training set. How well are your characters classified? If it performs poorly, re-evaluate the feature vector. If it performs well on the training data, check how well the classifier generalizes by running it on the reserved test data.
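The evaluation routine described above could be sketched roughly as follows. The helper names, the 20% hold-out fraction, and the `(sequence, label)` sample layout are my own assumptions; `classify` stands for whatever function wraps your trained models.

```python
import random

def split_samples(samples, test_fraction=0.2, seed=0):
    """Shuffle (sequence, label) pairs and reserve a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(classify, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for sequence, label in samples if classify(sequence) == label)
    return correct / len(samples)

# Hypothetical usage with placeholder data and a dummy classifier
samples = [([[0.1 * i]], "A" if i % 2 == 0 else "B") for i in range(10)]
train, held_out = split_samples(samples)
train_accuracy = accuracy(lambda sequence: "A", train)
held_out_accuracy = accuracy(lambda sequence: "A", held_out)
```

Low accuracy on `train` points at the features or the model size. High accuracy on `train` with much lower accuracy on `held_out` suggests the model is not generalizing.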

As for the number of states, I would start with something heuristic. Assuming your character images are scaled and normalized, maybe something like 40% (?) of the blocks are occupied? This is a rough guess on my part, since no source image was provided. For an 8x8 grid, this would mean that about 25 blocks are occupied. We could then start with 25 states, but that is probably naive: empty blocks can convey information (meaning the number of states might increase), but some sets of features may be observed in similar states (meaning the number of states might decrease). If it were me, I would pick something like 20 states. Having said that: be careful not to confuse features and states. Your feature vector is a representation of the things observed in a particular state. If the tests described above show that your model is performing poorly, adjust the number of states up or down and try again.
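For what it is worth, the arithmetic behind that guess is 0.40 x 64 = 25.6 blocks. If you would rather measure the occupancy than guess it, something like the sketch below could work. It assumes (hypothetically) that the first entry of every observation vector is the block's ink density, and the step and spread of the candidate list are arbitrary choices of mine.

```python
def estimate_state_count(sequences, ink_threshold=0.0):
    """Average number of occupied blocks per sample, rounded to an integer.
    Assumes obs[0] is the ink density of the block (a hypothetical feature)."""
    occupied = [sum(1 for obs in seq if obs[0] > ink_threshold)
                for seq in sequences]
    return max(1, round(sum(occupied) / len(occupied)))

def candidate_state_counts(start, step=5, spread=2, minimum=2):
    """State counts to try around the starting guess, smallest first."""
    counts = {max(minimum, start + k * step) for k in range(-spread, spread + 1)}
    return sorted(counts)

# Two placeholder samples on an 8x8 grid with 24 and 26 occupied blocks
sample_a = [[0.5]] * 24 + [[0.0]] * 40
sample_b = [[0.5]] * 26 + [[0.0]] * 38
start = estimate_state_count([sample_a, sample_b])
candidates = candidate_state_counts(start)
```

You would then train one set of models per candidate count and keep the count that does best on the reserved data.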

Good luck.
