I know that this can be done with a for loop, but I'm sure there is a more elegant solution in the design of data.table.
I have two data tables, and I will use 'iris' to illustrate my problem:
library("data.table")
A <- as.data.table(iris) #primary data table
B <- A[Sepal.Width > 3, .N, by = Species] #count from A meeting condition
head(A, 3)
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1: 5.1 3.5 1.4 0.2 setosa
#2: 4.9 3.0 1.4 0.2 setosa
#3: 4.7 3.2 1.3 0.2 setosa
B
# Species N
#1: setosa 42
#2: versicolor 8
#3: virginica 17
I would like to add a new variable to B, which is just the proportion of each species' rows in A that N represents, i.e. for the first row, the calculation would be something like this:
B[, Proportion := N/nrow(A[Species == "setosa"])]
The RHS of this assignment should obviously be dynamic, referring to the value of the first column (Species) in each row of B.
This iteration eludes me (although I suspect it has something to do with the data.table key?); I would really appreciate any help!
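For illustration, here is a minimal sketch of one way the dynamic RHS might be written without a loop, assuming an update join is acceptable. The helper table `totals` and its column `Total` are names made up for this example; they hold the per-species row counts of A, which B then looks up by Species.

```r
library(data.table)

A <- as.data.table(iris)                    # primary data table
B <- A[Sepal.Width > 3, .N, by = Species]   # count from A meeting condition

# Hypothetical helper: total number of rows per species in A
totals <- A[, .(Total = .N), by = Species]

# Update join: match each row of B to totals on Species and
# add Proportion by reference (i.Total is the Total column from totals)
B[totals, on = "Species", Proportion := N / i.Total]
B
```

No key needs to be set for this, since `on =` names the join column directly. If B is not needed as an intermediate step, the count and the proportion could also be computed in a single grouped call, e.g. `A[, .(N = sum(Sepal.Width > 3), Proportion = mean(Sepal.Width > 3)), by = Species]`.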