What is the machine learning algorithm for this simple optimization?

I will formulate a simple problem that I would like to solve using machine learning (in R or similar platforms): my algorithm takes 3 parameters (a, b, c) and returns a score s in the range [0, 1]. The parameters are all categorical: a has 3 options, b has 4, and c has 10. Therefore, my data set has 3 * 4 * 10 = 120 cases. High scores (close to 1) are desirable, low scores (close to 0) are not. Let's treat the algorithm as a black box that takes a, b, c and returns s.

The data set is as follows:

 a,  b,  c,  s
 ------------------
 a1, b1, c1, 0.223
 a1, b1, c2, 0.454
 ...

If I plot the density of s for each parameter, I get very wide distributions: some cases work very well (s > .8), others poorly (s < .2).

If I look at the cases where s is very high, I do not see a clear pattern. Parameter values that generally work poorly can work very well in combination with certain other values, and vice versa.

To measure how well a particular value (e.g. a1) performs, I calculate the median:

 # assumes column a stores the labels "a1", "a2", "a3"
 median(mydataset$s[mydataset$a == "a1"])

For example, median(a1) = .5 and median(b3) = .9, but when I combine them I get a lower result, s(a1, b3) = .3. On the other hand, median(a2) = .3 and median(b1) = .4, but s(a2, b1) = .7.

Given that no parameter value always works well, I think I should look for combinations (of 2 parameters) that work well together in a statistically significant way (i.e. excluding outliers that happen to have very high scores). In other words, I want a policy for choosing the optimal parameters, for example: the most effective combinations are (a1, b3), (a2, b1), etc.
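To make the comparison concrete, here is a minimal sketch in Python (standard library only) of the single-value medians and of a ranking of all 2-parameter combinations by their median score. The scores are random placeholders standing in for the real data set, and the names `rows`, `median_by`, `single` and `ranking` are hypothetical helpers, not part of the original problem.

```python
import itertools
import random
import statistics
from collections import defaultdict

# Synthetic stand-in for the real data set: one score per (a, b, c) case.
# These scores are random placeholders, not the results from the question.
random.seed(0)
A = [f"a{i}" for i in range(1, 4)]    # 3 options
B = [f"b{i}" for i in range(1, 5)]    # 4 options
C = [f"c{i}" for i in range(1, 11)]   # 10 options
rows = [{"a": a, "b": b, "c": c, "s": random.random()}
        for a, b, c in itertools.product(A, B, C)]

def median_by(rows, keys):
    """Median of s for every distinct combination of the given columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row["s"])
    return {combo: statistics.median(scores) for combo, scores in groups.items()}

# Single-value medians, e.g. median(a1), as computed in the question.
single = {k: median_by(rows, [k]) for k in ("a", "b", "c")}

# Medians for every 2-parameter combination, ranked from best to worst.
pair_medians = {}
for keys in itertools.combinations(("a", "b", "c"), 2):
    pair_medians.update(median_by(rows, keys))
ranking = sorted(pair_medians.items(), key=lambda item: item[1], reverse=True)

for combo, med in ranking[:5]:
    print(combo, round(med, 3))
```

The median is used because it is less sensitive to a single very high score than the mean. Note that the groups are small (10 scores per (a, b) pair, 4 per (a, c) pair, 3 per (b, c) pair), so this ranking alone does not establish statistical significance.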

Now, I think this is an optimization problem that can be solved with machine learning.

What standard methods would you recommend in this context?

EDIT: Someone suggested a linear programming solution with glpk, but I don't understand how to apply linear programming to this problem.

1 answer

The most standard method for this is linear regression. With it you can predict the score for particular parameter values and, more generally, obtain a function of your 3 parameters from which you can find the combination that gives the highest value.
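As a rough illustration of this suggestion, the sketch below fits a linear model with one main effect per parameter value, using only the Python standard library. It relies on an assumption worth stating: because the data set is a complete, balanced grid (every one of the 120 cases appears once), the least-squares main-effects fit reduces to "grand mean plus level mean minus grand mean" for each parameter. The scores are random placeholders, and `levels`, `effects`, `predict` and `best` are hypothetical names.

```python
import itertools
import random
import statistics

# Synthetic placeholder scores for the 3 * 4 * 10 = 120 cases (not real data).
random.seed(0)
levels = {
    "a": [f"a{i}" for i in range(1, 4)],
    "b": [f"b{i}" for i in range(1, 5)],
    "c": [f"c{i}" for i in range(1, 11)],
}
scores = {case: random.random() for case in itertools.product(*levels.values())}

grand_mean = statistics.mean(scores.values())

# Effect of each level = mean score at that level minus the grand mean.
# On a balanced full grid this equals the least-squares main-effects fit.
effects = {}
for pos, values in enumerate(levels.values()):
    for value in values:
        level_scores = [s for case, s in scores.items() if case[pos] == value]
        effects[value] = statistics.mean(level_scores) - grand_mean

def predict(a, b, c):
    """Predicted score from the additive (main-effects only) model."""
    return grand_mean + effects[a] + effects[b] + effects[c]

# The additive model is maximized by picking the best level of each parameter.
best = tuple(max(values, key=effects.get) for values in levels.values())
print("predicted best case:", best, round(predict(*best), 3))
print("observed score there:", round(scores[best], 3))
```

A model with main effects only chooses the best level of each parameter independently, so it cannot express the situation described in the question, where a value that is poor on average works well in a particular combination. Comparing the predicted and observed scores, as in the last two lines, gives a first impression of how much the additive model leaves unexplained.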

Source: https://habr.com/ru/post/1402696/

