Create a new variable with a reference to another data.table

Question

I know this can be done with a for loop, but I'm sure there is a more elegant solution built into data.table's design.

I have two data tables, and I will use 'iris' to illustrate my problem:

library("data.table")
A <- as.data.table(iris)                      #primary data table
B <- A[Sepal.Width > 3, .N, by = Species]     #count from A meeting condition

head(A, 3)
#       Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1:          5.1         3.5          1.4         0.2     setosa
#2:          4.9         3.0          1.4         0.2     setosa
#3:          4.7         3.2          1.3         0.2     setosa

B
#      Species  N
#1:     setosa 42
#2: versicolor  8
#3:  virginica 17

I would like to add a new variable to B that holds the proportion of each species' rows in A that the count in B represents, i.e. for the first row, the calculation would be something like this:

B[, Proportion := N/nrow(A[Species == "setosa"])]

The right-hand side of that comparison should obviously be dynamic, referring to the value of the first column (Species) in the current row of B.

How to do this iteration eludes me (although I suspect it has something to do with the data.table key?); I'd really appreciate any help!

+4

r data.table

daRknight Jan 6 '16 at 20:30


2 Answers

library("data.table")
A <- as.data.table(iris)                      #primary data table

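# Count the matching rows and all rows per species, then chain a second
# [...] call to add the proportion by reference.
B <- A[, .(group.count = nrow(.SD[Sepal.Width > 3]), total.count = .N), by = Species][
  , Proportion := group.count / total.count]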

# Just to validate the total counts:
A[, .N, by = Species][]

Result:

      Species group.count total.count Proportion
1:     setosa          42          50       0.84
2: versicolor           8          50       0.16
3:  virginica          17          50       0.34

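Explanation:

.SD ("sub data") holds the subset of the data.table for the current group, so nrow(.SD[Sepal.Width > 3]) counts the matching rows within each species.

.() is the data.table abbreviation for list(); here it builds the result columns.

:= adds the new column by reference (that is, in place, without copying the table).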
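As a possible shortcut to the grouped computation above: the mean of a logical vector is the share of TRUE values, so the proportion can also be computed in a single grouped call without .SD. This is only a sketch of an alternative, not the answerer's code; the name B2 is made up for illustration.

```r
library(data.table)

A <- as.data.table(iris)

# sum() of a logical vector counts the TRUE values and mean() gives their
# share, so both the count and the proportion come from the same condition.
# B2 is a hypothetical name used only for this sketch.
B2 <- A[, .(group.count = sum(Sepal.Width > 3),
            total.count = .N,
            Proportion  = mean(Sepal.Width > 3)),
        by = Species]

B2
```

The columns should match the result table shown in this answer; the only difference is that nothing is added by reference afterwards.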

+1

R Yoda 06 . '16 22:03

Jaap · Accepted Answer · 2016-01-06T20:45:03+0000

You could do it like this:

A <- as.data.table(iris)
B <- A[Sepal.Width > 3, .N, by = .("spec" = Species)]

B[, Proportion := N/nrow(A[Species == spec]), by = spec]

which gives:

> B
         spec  N Proportion
1:     setosa 42       0.84
2: versicolor  8       0.16
3:  virginica 17       0.34

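Explanation:

Species is renamed to spec because otherwise R / data.table cannot tell the Species column of A apart from the one in B when calculating Proportion.
With by = spec, the spec in A[Species == spec] refers to a single value for each group.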
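To illustrate why the rename matters, here is a small sketch of the name clash, followed by one possible alternative that avoids the rename by using an update join. The join variant is not part of the answer above; totals and total are names invented for this example, and it assumes a data.table version that supports on= (1.9.6 or later).

```r
library(data.table)

A <- as.data.table(iris)
B <- A[Sepal.Width > 3, .N, by = Species]

# Pitfall: inside A[...], both sides of the comparison resolve to A's own
# Species column, so the filter is TRUE everywhere and keeps all 150 rows.
nrow(A[Species == Species])

# Alternative sketch: compute the per-species totals once, then update B by
# joining on Species. N is B's column; i.total is the column from `totals`.
totals <- A[, .(total = .N), by = Species]   # hypothetical helper table
B[totals, on = "Species", Proportion := N / i.total]

B
```

With the join, Species keeps its original name in B, and the lookup in A happens once instead of once per group.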
