I am new to R and use it mainly for visualizing statistics with the ggplot2 library. Now I am facing a data preparation problem.
I need to write a function that removes several rows (2, 5 or 10) with the highest and lowest values in a specified column from a data frame and puts them in another data frame, and does this for each combination of two factors (in my case: for every day and server).
So far, I have completed the following steps (MWE using the built-in esoph dataset).
I sorted the data frame by the desired column (ncontrols in the example), in descending order:
esoph <- esoph[with(esoph, order(-ncontrols)), ]
I can display the first / last records for each factor level (in this example, for each age group):
by(data = esoph, INDICES = esoph$agegp, FUN = head, 3)
by(data = esoph, INDICES = esoph$agegp, FUN = tail, 3)
Basically, I can see the highest and lowest values, but I don't know how to extract them into another data frame and how to remove them from the main one.
Also, in the above example I can see the upper / lower records for each value of one factor (age group), but in fact I need the highest and lowest records for each combination of two factors - in this example they can be agegp and alcgp.
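To make the goal concrete, here is a minimal base R sketch of the kind of result I am after. The helper name split_extremes and its arguments are made up for illustration. It assumes that ties can be broken by row order and that a group with fewer than 2 * n rows should simply lose all of its rows:

```r
# Hypothetical helper (not from any package): moves the n lowest and
# n highest rows of value_col, within each combination of group_cols,
# into a separate data frame.
split_extremes <- function(df, value_col, group_cols, n = 2) {
  # One grouping factor per combination of the chosen columns.
  grp <- interaction(df[group_cols], drop = TRUE)

  # Within-group ranks from the bottom and from the top.
  # ties.method = "first" gives every row in a group a distinct rank.
  rank_first <- function(x) rank(x, ties.method = "first")
  rank_low  <- ave(df[[value_col]], grp, FUN = rank_first)
  rank_high <- ave(-df[[value_col]], grp, FUN = rank_first)

  is_extreme <- rank_low <= n | rank_high <= n

  list(
    extremes = df[is_extreme, , drop = FALSE],  # rows taken out
    trimmed  = df[!is_extreme, , drop = FALSE]  # rows left behind
  )
}

# Example: one lowest and one highest ncontrols row per agegp/alcgp pair.
res <- split_extremes(esoph, "ncontrols", c("agegp", "alcgp"), n = 1)
head(res$extremes)
head(res$trimmed)
```

In esoph each agegp/alcgp combination has at most four rows, so I use n = 1 here; n = 2 would already move every row into extremes. With my real data (days and servers) the groups are larger.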
I'm not even sure if the above steps are ok - maybe using plyr will work better? I would be grateful for any tips.