Assign a matrix to a subset of data.table

Question

I would like to assign a matrix to a multi-column subset of a data.table, but the matrix ends up being treated as a single column vector. For example,

 library(data.table)
 dt1 <- data.table(a1=rnorm(5), a2=rnorm(5), a3=rnorm(5))
 m1 <- matrix(rnorm(10), ncol=2)
 dt1[,c("a1","a2")] <- m1

 Warning messages:
 1: In `[<-.data.table`(`*tmp*`, , c("a1", "a2"), value = c(-0.308851784175091, :
   2 column matrix RHS of := will be treated as one vector
 2: In `[<-.data.table`(`*tmp*`, , c("a1", "a2"), value = c(-0.308851784175091, :
   Supplied 10 items to be assigned to 5 items of column 'a1' (5 unused)
 3: In `[<-.data.table`(`*tmp*`, , c("a1", "a2"), value = c(-0.308851784175091, :
   2 column matrix RHS of := will be treated as one vector
 4: In `[<-.data.table`(`*tmp*`, , c("a1", "a2"), value = c(-0.308851784175091, :
   Supplied 10 items to be assigned to 5 items of column 'a2' (5 unused)

The problem can be solved by first converting m1 to a data.table, but I'm curious about the reason for this behaviour. The above syntax works if dt1 is a data.frame; what is the architectural rationale for it not working with a data.table?

+5

r data.table

Abiel Nov 12 '13 at 0:51


2 answers

 dt1[,c("a1","a2")] <- as.data.table(m1)

gives a simple solution, but makes a copy.

@Simon O'Hanlon offers a solution in data.table syntax:

 dt1[ , `:=`( a1 = m1[,1] , a2 = m1[,2] ) ]

and, in my opinion, an even better data.table-style solution is:

 dt1[,c("a1","a2") := as.data.table(m1)]

+2

caranbot Jun 26 '19 at 14:03


mnel · Accepted Answer · 2013-11-12 02:08

A data.frame is not a matrix, and neither is a data.table. Both data.frame and data.table objects are lists. Lists and matrices are stored in very different ways; although the indexing syntax may look similar, the difference is handled under the hood.
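The distinction can be checked directly in a session. This is a minimal sketch, assuming the data.table package is installed; the objects df, dt and m are illustrative and do not come from the question.

```r
library(data.table)

# Illustrative objects; the names df, dt and m are not from the question.
df <- data.frame(a1 = 1:3, a2 = 4:6)
dt <- data.table(a1 = 1:3, a2 = 4:6)
m  <- matrix(1:6, ncol = 2)

# Both tabular classes are lists of columns; the matrix is not a list.
is_list   <- c(df = is.list(df), dt = is.list(dt), m = is.list(m))
is_matrix <- c(df = is.matrix(df), dt = is.matrix(dt), m = is.matrix(m))
print(is_list)
print(is_matrix)

# A matrix is one atomic vector with a dim attribute, so length() counts cells,
# whereas for a data.frame or data.table it counts columns.
n_cells   <- length(m)
n_columns <- length(df)
print(c(cells = n_cells, columns = n_columns))
```

Because a matrix is a single vector rather than a list of columns, something has to split it by column before it can fill several columns of a list-based table.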

Inside [<-.data.frame, a matrix value is split into a list with one element for each column.

(See the line value <- split(value, col(value)).)
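The effect of that split can be reproduced in base R. The sketch below uses a small made-up matrix; the name value mirrors the argument name in [<-.data.frame, and cols and df are illustrative.

```r
# A small example matrix; the values are made up for illustration.
value <- matrix(1:6, ncol = 2)

# col(value) gives the column index of every cell, so split() groups the
# cells by column and returns a list with one element per column.
cols <- split(value, col(value))
str(cols)

# Because of this conversion, assigning a matrix to several data.frame
# columns fills them column by column.
df <- data.frame(a1 = 0L, a2 = 0L, a3 = 7:9)
df[, c("a1", "a2")] <- value
print(df)
```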

Note that [<-.data.frame will copy the entire data.frame in the process of assigning to a subset of columns.

data.table tries to avoid this copy, so [<-.data.table is best avoided, since all the <- methods in R make copies.
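One way to observe the by-reference alternative is to compare the table's memory address before and after a := assignment. This is a sketch, assuming the data.table package is installed; it relies on address(), which data.table exports, and the names before and after are illustrative.

```r
library(data.table)

dt1 <- data.table(a1 = rnorm(5), a2 = rnorm(5), a3 = rnorm(5))
m1  <- matrix(rnorm(10), ncol = 2)

# address() reports where the object lives in memory.
before <- address(dt1)

# := updates the existing columns in place; as.data.table(m1) supplies one
# list element per target column, so nothing is treated as a single vector.
dt1[, c("a1", "a2") := as.data.table(m1)]

after <- address(dt1)
print(identical(before, after))

# The two columns now hold the matrix columns.
print(all.equal(dt1$a1, m1[, 1]))
print(all.equal(dt1$a2, m1[, 2]))
```

The RHS here is still converted with as.data.table(m1), so the matrix itself is copied once into a list of columns, but the table is not copied.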

Inside [<-.data.table, [<-.data.frame is called if i is a matrix, but not if only value is a matrix.

data.table usually likes you to be explicit in ensuring that the data types are consistent in an assignment. This helps to avoid any coercion and the copying that comes with it.

You could perhaps file a feature request for compatibility here, but given that your usage is far outside what is recommended, the package authors may ask you to simply use the conventions and approaches of data.table.
