Monte Carlo simulation code: generate samples of a given size in R

I started by creating a sample of 500 uniformly distributed random numbers between 0 and 1, using the following code:

    set.seed(1234)
    X <- runif(500, min=0, max=1)

Now I need to write pseudocode that generates 10,000 samples of size N = 500 for an MC simulation, calculates the mean of each newly created X, and stores the iteration number and the mean in a results object. I have never done this before, and so far I have this:

    n.iter <- (10000*500)
    results <- matrix(0, n.iter, 4)

Finally, once this is done, I have to run it, then get the median, mean, and min/max of the accumulated sample and save them in a data frame called MC.table. (Also note that I have no idea why there is a "4" in the matrix code above; I was working from previous examples.) Any advice or help would be greatly appreciated.

EDIT: I have an example that might work, but I really don't understand what is happening in it, so please clarify whether it applies here:

    Ni <- 10000
    n <- 500
    c <- 0
    for (i in n){
      for (j in 1:Ni){
        c <- c + 1
        d <- data.frame(x= , y= )
        results[c,1] <- c
        results[c,2] <- j
        results[c,3] <- i
        results[c,4] <- something(d$x, d$y)
        rm(d)
      }
    }

If you could take the time to explain what this does, it would really help me. Thanks!

+2
3 answers

You can try using data.table, a package that can be installed with install.packages("data.table"). With it installed, you would run something like this:

    > require(data.table)
    > dt <- data.table(x=runif(500*10000),iter=rep(1:500,each=10000))
    #                   x iter
    #       1: 0.48293196    1
    #       2: 0.61935416    1
    #       3: 0.99831614    1
    #       4: 0.26944687    1
    #       5: 0.38027524    1
    #      ---
    # 4999996: 0.11314160  500
    # 4999997: 0.07958396  500
    # 4999998: 0.97690312  500
    # 4999999: 0.81670765  500
    # 5000000: 0.62934609  500
    > summaries <- dt[,list(mean=mean(x),median=median(x)),by=iter]
    #      iter      mean    median
    #   1:    1 0.5005310 0.5026592
    #   2:    2 0.4971551 0.4950034
    #   3:    3 0.4977677 0.4985360
    #   4:    4 0.5034727 0.5052344
    #   5:    5 0.4999848 0.4971214
    #  ---
    # 496:  496 0.5013314 0.5048186
    # 497:  497 0.4955447 0.4941715
    # 498:  498 0.4983971 0.4910115
    # 499:  499 0.5000382 0.4997024
    # 500:  500 0.5009614 0.4988237
    > min_o_means <- min(summaries$mean)
    # [1] 0.4914826

I think the syntax is fairly simple. You can look up any of the functions using ? (e.g. ?rep). Lines starting with # simply display the generated objects. In a data.table printout, the number to the left of the : is just the row number, and --- stands for rows omitted from the display.
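Note that the call above groups the 5,000,000 draws into 500 iterations of 10,000 values each, whereas the question asks for 10,000 samples of 500 values. Below is a minimal sketch of the same grouping idea with the two numbers swapped. It is written in base R so that it runs without the package, with tapply standing in for by=iter; the names n_iter, n, x and iter are illustrative assumptions, not part of the answer above.

```r
set.seed(1234)            # seed taken from the question
n_iter <- 10000           # number of Monte Carlo samples
n      <- 500             # size of each sample

# one long vector of draws, plus a label saying which sample each draw belongs to
x    <- runif(n_iter * n)
iter <- rep(seq_len(n_iter), each = n)

# per-sample mean and median, one row per iteration
summaries <- data.frame(
  iter   = seq_len(n_iter),
  mean   = as.vector(tapply(x, iter, mean)),
  median = as.vector(tapply(x, iter, median))
)

head(summaries)
min(summaries$mean)       # smallest of the 10,000 sample means
```

With data.table loaded, the equivalent change would presumably be rep(1:10000, each=500) in the iter column.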

+4

I think that the answer I would give would really depend on whether you want to learn pseudo-code or want to know how to do it. This answer is what I would recommend for someone who would like to learn how to work with R.

First I would make a matrix with N = 500 columns and 10,000 rows. R works best when we allocate space ahead of time for the numbers we are going to store.

X=matrix(NA,nrow=10000,ncol=500)

You know how to create 500 random variables for one row.

runif(500,0,1)

Now you need to work out how to do this 10,000 times and assign each result to a row of X. A for loop will work.

for(i in 1:10000) X[i,]=runif(500,0,1)
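For comparison, here is a sketch of a loop-free way to fill the same matrix: draw all the numbers in one call and let matrix() lay them out row by row. The names X_loop, X_vec, n_iter and n are assumptions made for this illustration. Resetting the seed before each version is only there so the two results can be compared.

```r
n_iter <- 10000   # number of rows (iterations)
n      <- 500     # number of columns (sample size)

# loop version, as in the answer: preallocate, then fill one row at a time
set.seed(1234)
X_loop <- matrix(NA, nrow = n_iter, ncol = n)
for (i in 1:n_iter) X_loop[i, ] <- runif(n, 0, 1)

# loop-free version: draw everything at once and fill row by row
set.seed(1234)
X_vec <- matrix(runif(n_iter * n, 0, 1), nrow = n_iter, ncol = n, byrow = TRUE)

# check whether both approaches produced the same matrix
identical(X_loop, X_vec)
```

The loop is easier to follow while learning; the single runif call is usually faster for large simulations.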

Then you need to work out how to get a summary of each row. One function that might help is rowMeans(). Look at its help page, and then try to get the mean of each row of the matrix X.

To get the means for each iteration:

rowMeans(X)

To get an idea of what these numbers look like, I would be inclined to plot them:

plot(rowMeans(X))
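To connect this back to the question, here is a minimal sketch of the remaining steps: storing the iteration number with each mean, then summarising those means in MC.table. The object names results and MC.table come from the question. The column names are assumptions, and so is the reading that "the accumulated sample" means the 10,000 stored means.

```r
set.seed(1234)
n_iter <- 10000   # number of Monte Carlo iterations
n      <- 500     # size of each sample

# each row of X is one simulated sample of size n
X <- matrix(runif(n_iter * n, 0, 1), nrow = n_iter, ncol = n, byrow = TRUE)

# results: iteration number and the mean of that iteration's sample
results <- data.frame(iteration = seq_len(n_iter), average = rowMeans(X))

# MC.table: summary statistics of the stored means
MC.table <- data.frame(
  median  = median(results$average),
  average = mean(results$average),
  min     = min(results$average),
  max     = max(results$average)
)
MC.table
```

Only two columns are needed in results here, which suggests that the "4" in the question's matrix call was carried over from an example that stored four values per iteration.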


+2

I think you are describing a simple bootstrap. Eventually, you can use the boot function. But until you understand the mechanics, I feel that loops are the way to go. This should get you started:

    test <- function(seed = 1234, sample.size = 500, resample.number = 1000, alpha = 0.05) {
      # make the run reproducible (the seed argument was otherwise unused)
      set.seed(seed)

      # initialize original sample
      original.sample <- runif(sample.size, min = 0, max = 1)

      # initialize data.frame
      resample.results <- data.frame("Run.Number" = NULL, "mean" = NULL)

      for (counter in 1:resample.number) {
        temp <- sample(original.sample, size = length(original.sample), replace = TRUE)
        temp.mean <- mean(temp)
        temp.table.row <- data.frame("Run.Number" = counter, "mean" = temp.mean)
        resample.results <- rbind(resample.results, temp.table.row)
      }

      # sort the resampled means in ascending order
      resample.results <- resample.results[with(resample.results, order(mean)), ]

      # row positions of the CI bounds and the median in the sorted list
      lowerCI.row <- resample.number * alpha / 2
      upperCI.row <- resample.number * (1 - (alpha / 2))
      median.row <- resample.number / 2

      # for the mean information
      median <- resample.results$mean[median.row]
      lowerCI <- resample.results$mean[lowerCI.row]
      upperCI <- resample.results$mean[upperCI.row]

      # for the position information
      median.run <- resample.results$Run.Number[median.row]
      lowerCI.run <- resample.results$Run.Number[lowerCI.row]
      upperCI.run <- resample.results$Run.Number[upperCI.row]

      mc.table <- data.frame("median" = NULL, "lowerCI" = NULL, "upperCI" = NULL)
      values <- data.frame(median, lowerCI, upperCI)
      # as.numeric because R doesn't like to mix data types
      runs <- as.numeric(data.frame(median.run, lowerCI.run, upperCI.run))
      mc.table <- rbind(mc.table, values)
      mc.table <- rbind(mc.table, runs)
      print(mc.table)
    }

After each resample of your data, you compute its mean. Then you sort all of the resampled means. The middle of this sorted list is the median. With 10,000 resamples, for example, the 250th sorted mean is the lower bound of your 95% CI. Although I did not do this here, the minimum is simply at position 1 and the maximum at position 10000. Be careful when you lower the number of resamples: the way I calculated the positions can produce a non-integer value, which will confuse R.
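The non-integer position issue can be sidestepped in a couple of ways. R does not raise an error for a fractional index; it silently truncates it toward zero (x[2.7] returns x[2]), so the position used may not be the one intended. The sketch below shows explicit rounding and, as an alternative, quantile(), which interpolates between positions. The names boot.means, sorted.means, lower.pos and upper.pos are assumptions for this illustration, and 999 resamples is chosen only to force fractional positions.

```r
set.seed(1234)
original.sample <- runif(500, min = 0, max = 1)
resample.number <- 999     # deliberately not a "round" number
alpha <- 0.05

# bootstrap: resample with replacement and keep each resample's mean
boot.means <- replicate(resample.number,
                        mean(sample(original.sample, replace = TRUE)))
sorted.means <- sort(boot.means)

# positional approach with explicit rounding, so the index is a whole number
lower.pos <- max(1, floor(resample.number * alpha / 2))
upper.pos <- ceiling(resample.number * (1 - alpha / 2))
c(lower = sorted.means[lower.pos], upper = sorted.means[upper.pos])

# quantile() interpolates between positions, so no index arithmetic is needed
quantile(boot.means, probs = c(alpha / 2, 0.5, 1 - alpha / 2))

# min and max are the first and last of the sorted means
range(boot.means)
```

Rounding outward (floor for the lower bound, ceiling for the upper) is one conservative choice among several; quantile() also offers different interpolation rules through its type argument.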

By the way, I put this in function form. If you prefer to run it line by line, just run all the lines between the function() header and the closing } of its body.

+2

Source: https://habr.com/ru/post/986793/

