Creating Good Training Data for Haar Cascades

I am trying to create Haar cascades to perform OCR on a specific font, with one classifier per character.

I can generate tons of training data simply by drawing the font onto images. So the plan is to generate positive training data for each character and use examples of the other characters as negative training data.

I wonder how much variation I should put into the training data. Usually I would just try everything, but these things are going to take a few days to train (for each character!), so some tips would be good.

So a few questions:

  • Does the learning algorithm determine that I don't care about transparent pixels? Or will it work better if I overlay characters on different backgrounds?
  • Should I include images where each character is displayed with different prefixes and suffixes, or should I just process each character separately?
  • Should I include images where the character is scaled up and down? My understanding is that the algorithm largely ignores size and scales everything down for efficiency anyway.

Thanks!

1 answer

Does the learning algorithm determine that I don't care about transparent pixels? Or will it work better if I overlay characters on different backgrounds?

The more "noise" you put into your training images, the more robust the detector will be, but yes, the longer it will take to train. This, however, is where your negative samples come into play. If you have as many negative training samples as possible, covering the widest possible range, you will create more robust detectors. That said, if you have a specific use case, I would suggest skewing your training sets a bit to fit it: the detector will be less robust in general, but much better in your application.
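Overlaying the characters on varied backgrounds can be sketched as below. This is a minimal illustration using plain nested lists of grayscale values; the helper names (`composite`, `random_background`, `make_positives`) and the toy 3x3 glyph are made up for the example, and uniform noise stands in for whatever backgrounds you actually expect to see.

```python
import random

def composite(glyph, alpha, background):
    """Blend a grayscale glyph over a background using per-pixel alpha (0.0 to 1.0)."""
    return [
        [round(a * g + (1.0 - a) * b) for g, a, b in zip(g_row, a_row, b_row)]
        for g_row, a_row, b_row in zip(glyph, alpha, background)
    ]

def random_background(width, height, rng):
    """Uniform noise; an assumption for the sketch, real textures would be better."""
    return [[rng.randrange(256) for _ in range(width)] for _ in range(height)]

def make_positives(glyph, alpha, count, seed=0):
    """Produce `count` copies of the glyph, each over a different background."""
    rng = random.Random(seed)
    height, width = len(glyph), len(glyph[0])
    return [
        composite(glyph, alpha, random_background(width, height, rng))
        for _ in range(count)
    ]

# Toy 3x3 "glyph": a black vertical bar that is opaque only in the middle column.
glyph = [[0, 0, 0] for _ in range(3)]
alpha = [[0.0, 1.0, 0.0] for _ in range(3)]
samples = make_positives(glyph, alpha, count=5)
```

The opaque pixels stay identical in every sample while the transparent ones vary, which is what lets training treat those positions as uninformative.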

Should I include images where each character is displayed with different prefixes and suffixes, or should I just process each character separately?

If you want to find individual letters, then train on them individually. If you train it to detect "ABC" and you only need "A", it will start getting mixed messages. Just train on each letter "A", "B", etc. separately, and then your detector will be able to pick out each individual letter in larger images.
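A rough sketch of the per-letter split, assuming the generated samples are already grouped by character. The file paths and function names are hypothetical; the annotation line layout (`path count x y w h`) follows the info-file format used by OpenCV's sample tools, with each image assumed to contain exactly one glyph filling the whole frame.

```python
def split_for_character(samples_by_char, target):
    """Positives are the target's samples; negatives are every other character's."""
    positives = list(samples_by_char[target])
    negatives = [
        path
        for ch, paths in sorted(samples_by_char.items())
        if ch != target
        for path in paths
    ]
    return positives, negatives

def info_lines(paths, width, height):
    """One annotation line per positive: path, object count, then x y w h."""
    return [f"{path} 1 0 0 {width} {height}" for path in paths]

# Hypothetical file layout: one directory per character.
samples_by_char = {
    "A": ["A/0.png", "A/1.png"],
    "B": ["B/0.png"],
    "C": ["C/0.png", "C/1.png"],
}
positives, negatives = split_for_character(samples_by_char, "A")
annotations = info_lines(positives, 20, 20)
```

Running the same split once per character gives each classifier its own positive set and reuses all other glyphs as negatives, as the question proposes.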

Should I include images where the character is scaled up and down? My understanding is that the algorithm largely ignores size and scales everything down for efficiency anyway.

I do not believe that is right. AFAIK the Haar algorithm cannot detect anything smaller than the trained image size. Therefore, if you train on 50x50 letters but the letters in your images are 25x25, you will not find them. If you train and detect the other way around (small training samples, larger letters in the image), you will get results. Start small and let the algorithm resize (up) for you.
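To make the size argument concrete, here is a simplified model of the scan, assuming the detector slides a square window that starts at the training size and grows by a fixed factor. The name `scale_factor` echoes the `scaleFactor` parameter of OpenCV's `detectMultiScale`, but the function itself is an illustration, not that implementation.

```python
def window_sizes(trained, image, scale_factor=1.1):
    """List the square window sizes tried, from the trained size up to the image size."""
    sizes = []
    scale = 1.0
    while True:
        size = int(round(trained * scale))
        if size > image:
            break
        sizes.append(size)
        scale *= scale_factor
    return sizes

# Trained on 50x50 samples: no window is ever as small as a 25x25 letter.
large_training = window_sizes(trained=50, image=400)

# Trained on 20x20 samples: windows grow upward and pass through larger sizes.
small_training = window_sizes(trained=20, image=400)
```

Under this model the smallest window equals the training size, so a cascade trained on small samples can still reach large letters, while one trained on large samples never reaches small ones.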


Source: https://habr.com/ru/post/983725/

