Ensemble learning, multiple classifier system

I am trying to use a multiple classifier system (MCS) to improve my results on limited data and become more accurate.

I am currently using K-means clustering, though FCM (fuzzy c-means) would also work, to group the data into clusters; the data can represent anything, for example colors. First I cluster the data after pre-processing and normalization and get several distinct clusters with large gaps between them. Then I use the clusters as training data for Bayes classifiers: each cluster is a separate color, a Bayes classifier is trained per cluster, and the data is then passed through the separate Bayes classifiers. Each Bayes classifier is trained on only one color. For example, take the spectrum 3 - 10 as blue and 13 - 20 as red; the range 0 - 3 would be white up to 1.5, then shift gradually toward blue from 1.5 to 3, with the same gradual transition from blue to red.
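
Here is a minimal sketch of that pipeline, assuming scikit-learn is available; the one-Bayes-classifier-per-cluster step is simplified to a single Naive Bayes model trained on the cluster assignments as pseudo-labels, and all numbers are illustrative:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.naive_bayes import GaussianNB

    rng = np.random.default_rng(0)
    # Synthetic 1-D "color" values: white-ish (0-3), blue (3-10), red (13-20)
    X = np.concatenate([rng.uniform(0, 3, 50),
                        rng.uniform(3, 10, 50),
                        rng.uniform(13, 20, 50)]).reshape(-1, 1)

    # Step 1: unsupervised grouping (FCM would be a drop-in alternative)
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
    pseudo_labels = kmeans.labels_

    # Step 2: train a Bayes classifier on the cluster assignments
    nb = GaussianNB().fit(X, pseudo_labels)

    # New samples are then routed through the trained classifier
    print(nb.predict([[1.0], [7.5], [16.0]]))  # one cluster id per sample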

What I would like to know is which aggregation method (if that is what you would use) could be applied to make the Bayes classifiers stronger, and how it works. Can the aggregation method determine the answer on its own, or does a human have to correct the results, with those corrected answers then fed back into the Bayes training data? Or a combination of both?

Looking at bootstrap aggregation (bagging), every model in the ensemble votes with the same weight, so I am not entirely sure it fits this particular case: would bagging be the aggregation method to use here? Boosting, by contrast, builds the ensemble incrementally by training each new model instance to emphasize the training examples that previous models misclassified, but I am not sure it would be a better alternative to bagging, because I do not understand how it builds incrementally on the new instances. Finally there is Bayesian model averaging, an ensemble technique that seeks to approximate the Bayes optimal classifier by sampling hypotheses from the hypothesis space and combining them using Bayes' rule, but I am completely unsure how you would sample hypotheses from that space.
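
To make the bagging/boosting distinction concrete, here is a hedged sketch of both applied to Naive Bayes base models; it assumes scikit-learn 1.2 or later (for the estimator parameter name) and a purely synthetic dataset:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    X, y = make_classification(n_samples=300, n_features=5, random_state=0)

    # Bagging: each model sees a bootstrap sample; all votes have equal weight.
    bagging = BaggingClassifier(estimator=GaussianNB(),
                                n_estimators=25, random_state=0)

    # Boosting: models are added one at a time, each one reweighting the
    # training examples that the previous models misclassified.
    boosting = AdaBoostClassifier(estimator=GaussianNB(),
                                  n_estimators=25, random_state=0)

    for name, clf in [("bagging", bagging), ("boosting", boosting)]:
        print(name, cross_val_score(clf, X, y, cv=5).mean())

Bayesian model averaging has no equally standard off-the-shelf implementation; in practice the hypotheses are usually obtained by sampling model parameters (for example via MCMC) and weighting each sampled model by its posterior probability.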

I know that a competitive approach between two classification algorithms is commonly used, where weighting is applied and, if it is done correctly, you get the best of both classifiers; but to be safe I would prefer to avoid a competitive approach.
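
For reference, a weighted combination in which both classifiers always contribute (rather than compete) could look like the sketch below; scikit-learn is assumed, and the second classifier and the weights are illustrative choices, not part of the setup described above:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB

    X, y = make_classification(n_samples=300, n_features=5, random_state=0)

    # Soft voting averages predicted probabilities; the weights control how
    # much each classifier's opinion counts in the combined prediction.
    combo = VotingClassifier(
        estimators=[("nb", GaussianNB()),
                    ("lr", LogisticRegression(max_iter=1000))],
        voting="soft",
        weights=[2, 1],  # illustrative: trust the Bayes model twice as much
    ).fit(X, y)
    print(combo.predict(X[:5]))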

Another question: would it be useful to use these two methods together in this way? I know the example I have given is very primitive and may not call for this treatment, but could the combination be useful on more complex data?

1 answer

I have some concerns about the method you are applying:

  • K-means assigns each point to the nearest cluster, and you then train the classifier on that output. The classifier can outperform the clustering's implicit classification, but only by taking into account the number of samples in each cluster. For example, if your training data after clustering is Type A (60%), Type B (20%), Type C (20%), your classifier will tend to assign ambiguous samples to Type A in order to make fewer classification errors.
  • K-means depends on which "coordinates" / "features" you extract from the objects. If you use features in which objects of different types are mixed together, the performance of K-means will degrade. Removing those features from the feature vector can improve your results.
  • The "features" / "coordinates" representing the objects you want to classify can be measured in different units. This can affect your clustering algorithm, because through the clustering error function you implicitly define a unit conversion between them: the final cluster set is selected over several clustering runs (obtained with different cluster initializations) using that error function, so the different coordinates of your feature vector are implicitly weighed against each other (in effect, an implicit conversion factor). A standardization step that removes this effect is sketched just below.

Given these three points, you are likely to improve the overall performance of your algorithm by adding preprocessing steps. For example, in object recognition for computer vision, most of the useful information in an image comes from its edges alone; all of the color information and part of the texture information go unused. The edges are extracted during image processing to compute histogram of oriented gradients (HOG) descriptors. Such a descriptor returns "features" / "coordinates" that separate the objects better, thereby improving classification (object recognition) performance. In theory a descriptor discards part of the information contained in the image, but it offers two main advantages: (a) the classifier processes data of lower dimensionality, and (b) descriptors computed from test data can be compared more easily with those from the training data.
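
A minimal sketch of HOG extraction, assuming scikit-image is installed; the random array stands in for a real grayscale image:

    import numpy as np
    from skimage.feature import hog

    image = np.random.rand(64, 64)  # placeholder for a real grayscale image

    # The descriptor summarizes local edge orientations, discarding color
    # and yielding a much lower-dimensional feature vector than raw pixels.
    features = hog(image,
                   orientations=9,
                   pixels_per_cell=(8, 8),
                   cells_per_block=(2, 2))
    print(features.shape)  # (1764,) for a 64x64 image with these settings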

In your case, I suggest you try to improve your accuracy using a similar approach:

  • Provide richer features to your clustering algorithm.
  • Use prior domain knowledge to decide which features to add to or remove from your feature vector.
  • Always consider the possibility of obtaining labeled data, so that supervised learning algorithms can be applied.

Hope this helps ...


Source: https://habr.com/ru/post/1399422/

