I just came across a .do file that I need to translate to R because I don't have a Stata license. My Stata is rusty, so can someone confirm that the code does what I think it does?
For reproducibility, I have adapted it to a dataset I found online: a dairy production dataset (p004) that accompanies a textbook by Chatterjee, Hadi and Price.
Here's the Stata code:
    collapse (min) min_protein = protein ///
             (mean) avg_protein = protein ///
             (median) median_protein = protein ///
             (sd) sd_protein = protein ///
             if protein > 2.8, by(lactatio)
Here is what I think it does, written in data.table syntax:
    library(data.table)
    library(foreign)
    DT = read.dta("p004.dta")
    setDT(DT)
    DT[protein > 2.8,
       .(min_protein = min(protein),
         avg_protein = mean(protein),
         median_protein = median(protein),
         sd_protein = sd(protein)),
       keyby = lactatio]
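To make my reading concrete without needing the .dta file, here is a minimal sketch on a toy table. The values are made up for illustration (they are not from p004); only the column names `protein` and `lactatio` come from the code above. If the `collapse` does what I think, Stata should produce the same per-group numbers from these rows.

```r
library(data.table)

# Toy data with made-up values (NOT from p004); column names mirror the question.
toy <- data.table(
  lactatio = c(1, 1, 1, 2, 2, 2),
  protein  = c(2.5, 3.0, 3.4, 2.9, 3.1, 3.3)
)

# Same shape as the translation: keep rows with protein > 2.8,
# then compute one summary row per value of lactatio.
res <- toy[protein > 2.8,
           .(min_protein    = min(protein),
             avg_protein    = mean(protein),
             median_protein = median(protein),
             sd_protein     = sd(protein)),
           keyby = lactatio]

print(res)
```

The row with protein 2.5 is dropped before aggregating, so group 1 is summarised over 3.0 and 3.4 only, and group 2 over all three of its rows.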
Is it correct?
This would be easy to confirm if I had used Stata in the last 18 months or if I had a copy installed - hoping I could bend the ear of someone for whom this is true. Thanks.