The variance of Monte Carlo simulations tends to be very large. (It does converge to zero as n increases, but only slowly.) You can read about Buffon's needle, in particular here. I would like to draw your attention to the figure in which 5 different runs of 1000 throws each are plotted. Looking at any one of them, it appears the value has already converged, but to a completely wrong (and in each case different) result.
It is good that you started with an analysis. The error estimate derived rigorously in the source above is about 5.6/n. That is the variance, so to get the standard deviation we need to take a square root, which makes things even worse. After 1000 throws you can expect π to be off by around 0.073 (although it can be twice as much, or much less, by sheer chance). After a million throws it should be correct to roughly ±0.002. After a hundred million you will likely get the first three digits after the decimal point right, with the first error in the fourth or (rarely) the fifth. I don't know your n, but the values you get look typical. Edit: with n = 1,000,000 they come out at +3σ and −1σ.
Note a common misconception: that you can get around this by doing several runs and averaging their results (which, after all, should fluctuate above and below the correct value with equal probability). This does work, of course, but brings no benefit beyond the obvious one that it can be spread over several machines. The error of running 100 simulations of a thousand iterations each is exactly the same as that of a single run of one hundred thousand iterations.
The obvious way to get a better estimate is therefore to increase the number of throws. One of the great advantages of Monte Carlo methods is that the individual samples are usually completely independent of each other, and thus lend themselves well to parallelization (OpenMP, vectorized instructions, GPUs). But the first step is simply to make your code faster, so that you can increase n by an order of magnitude or so. For example, since you know from the analysis that the accuracy will never be overwhelming anyway, you can use float instead of double, which is twice as fast on some platforms. Eliminate the sin call (this is all the more elegant because you then don't need the value of π to begin with). Print a summary only every 1000 steps or so.
Less obviously, the analysis also pays off by letting you see how to reduce the numerator of the variance (the denominator will be n whatever you do). If I'm not mistaken, the experiment works best when the length of the needle equals the distance between the lines: shorter than that is suboptimal, and longer leads to multiple intersections, which you do not want. There may be other ways to improve it by changing the distribution of the angle and/or the position.