Does the learning algorithm determine that I don't care about transparent pixels? Or will it work better if I overlay characters on different backgrounds?
The more "noise" you give your images in the training data, the more robust the resulting detector will be, but yes, the longer it will take to train. This, however, is where your negative samples come into play: the more negative samples you have, covering the widest possible variety, the more reliable your detector will be. That said, if you have a specific use case, I would suggest biasing your training set toward it; the result will be less general, but it will work much better in your application.
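As a minimal sketch of how to gather those negatives (assuming you are using OpenCV's cascade training workflow, which expects a plain-text background file listing one image path per line; the directory name `negatives/` and output name `bg.txt` are just placeholders):

```python
import os

# Hypothetical layout: all negative (background) images live in "negatives/".
# The cascade trainer expects a text file with one image path per line.
NEG_DIR = "negatives"

with open("bg.txt", "w") as bg:
    for name in sorted(os.listdir(NEG_DIR)):
        if name.lower().endswith((".png", ".jpg", ".jpeg", ".bmp")):
            bg.write(os.path.join(NEG_DIR, name) + "\n")
```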
Should I include images where each character is displayed with different prefixes and suffixes, or should I just process each character separately?
If you want to find individual letters, then train on individual letters. If you teach the detector to find "ABC" but you only need "A", it will get mixed signals. Just prepare samples for each letter ("A", "B", etc.) separately, and your detector will then be able to locate each individual letter within larger images.
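For illustration only (assuming one cascade was trained per letter with OpenCV; the cascade file names below are hypothetical), running the per-letter detectors over a larger image could look like this:

```python
import cv2

# One separately trained cascade per letter (file names are hypothetical).
cascades = {
    "A": cv2.CascadeClassifier("cascade_A.xml"),
    "B": cv2.CascadeClassifier("cascade_B.xml"),
}

image = cv2.imread("page.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

for letter, cascade in cascades.items():
    # detectMultiScale slides a rescaled detection window over the image.
    hits = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in hits:
        print(f"Found '{letter}' at x={x}, y={y}, size={w}x{h}")
```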
Should I include images where the character is scaled up and down? Or does the algorithm largely ignore size anyway, so that adding scaled copies only reduces efficiency?
I do not believe that is right. AFAIK, the Haar algorithm cannot scale below the size of the trained image. So if you train all your samples at 50x50 but the letters in your images are 25x25, you will not find them. If you do it the other way around (small training window, larger letters in the image), you will get results. Start small and let the algorithm scale up for you.
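A hedged sketch of that point with OpenCV (the cascade file name and the 25x25 training window are assumptions): detectMultiScale scales its search window up from the training size, never down, so minSize cannot usefully go below the window you trained with.

```python
import cv2

# Hypothetical cascade trained with a 25x25 window.
cascade = cv2.CascadeClassifier("cascade_A.xml")

gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)

# The detector enlarges its search window starting from the training size,
# so letters smaller than 25x25 in the image will not be found. Training
# small and letting detectMultiScale scale up covers the widest size range.
hits = cascade.detectMultiScale(
    gray,
    scaleFactor=1.1,   # enlarge the search window by 10% per pass
    minNeighbors=5,
    minSize=(25, 25),  # cannot go below the training window size
)
print(hits)
```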