Basically the problem mimics the following:
There is an urn with 50 green balls and 50 red balls.
I am allowed to select balls from the urn without replacement with the following rules: for each red ball I lose a dollar, for each selected green ball I get a dollar.
I can stop the collection when I want. In the worst case, I select all 100 and net 0.
The question is to develop an optimal stop strategy and create a program to calculate the expected value of the strategy.
My strategy is to continue collecting the balls, while the expected value of choosing another ball is positive.
That is, the stop rule is DYNAMIC.
In latex, here is a recursive formula in the image:
http://i.stack.imgur.com/fnzYk.jpg
#include <stdio.h> #include <math.h> #include <stdlib.h> double ExpectedValue(double, double); double max(double, double); main() { double g = 50; double r = 50; double EV = ExpectedValue(g, r); printf ("%f\n\n", EV); system("PAUSE"); } double ExpectedValue(double g, double r){ double p = (g / (g + r)); double q = 1 - p; if (g == 0) return r; if (r == 0) return 0; double E_gr = max ((p * ExpectedValue (g - 1, r)) + (q * ExpectedValue (g, r - 1)), (r - g)); return E_gr; } double max(double a, double b){ if (a > b) return a; else return b; }
I skipped it for 30 minutes and it still worked. For small values โโof g and r, the solution is calculated very quickly. What am I doing wrong?
Any help is much appreciated!