Time to retrieve single elements from a matrix object

As in a related question, I ran a microbenchmark that reads a single element from a large matrix. I was surprised by how much performance degrades when indexing by row name:

    library(microbenchmark)

    m = matrix(1, nrow=1000000, ncol=10)
    rownames(m) = as.character(1:1000000)
    microbenchmark(m["3450", 1], m[3450, 1], times=1000)

    Unit: microseconds
             expr       min         lq      median          uq        max neval
     m["3450", 1] 176465.55 183443.369 185321.5540 185982.0840 522346.477  1000
       m[3450, 1]      3.19      3.445     10.7155     14.1545     29.897  1000

I need to use row names to read my matrix elements. How can I improve performance?

UPDATE

I have added timings for Jeffrey's suggestion and for .subset(). I don't know why, but .subset() shows much better timings for read-only access ([[ allows assignment, whereas .subset() does not):

    microbenchmark(m["3450", 1], m[["3450", 1]], m[3450, 1],
                   .subset(m, 1)["3450"], .subset(m, 1)[3450], times=1000)

    Unit: microseconds
                      expr        min         lq      median          uq        max neval
              m["3450", 1] 176667.252 180197.435 181969.2900 185090.9155 254075.814  1000
            m[["3450", 1]]    144.732    145.341    151.1440    191.9960   1096.183  1000
                m[3450, 1]      2.900      3.290      4.4400      6.5025     22.391  1000
     .subset(m, 1)["3450"]      2.704      3.140      4.1285     14.8740     43.134  1000
       .subset(m, 1)[3450]      2.460      2.815      3.2680     13.0300     38.105  1000
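Before comparing these timings, it may be worth checking that every expression actually returns the requested element. The following is a minimal sketch, not part of the original benchmark: it assumes a smaller matrix (5,000 rows) filled with distinct values, both hypothetical choices made so that a wrong lookup is visible. With a single index, .subset(m, 1) behaves like m[1] without method dispatch, so it should return only the first element of the matrix rather than column 1.

```r
# Smaller matrix with distinct values so that a wrong lookup is visible
# (the size and the values are assumptions for illustration).
n <- 5000
m <- matrix(seq_len(n * 10), nrow = n, ncol = 10)
rownames(m) <- as.character(seq_len(n))

by_name     <- m["3450", 1]      # character row index via `[`
by_name2    <- m[["3450", 1]]    # character row index via `[[`
by_position <- m[3450, 1]        # integer row index

# With a single index, .subset(m, 1) acts like m[1]: it returns the first
# element of the matrix as an unnamed length-one vector, not column 1.
first_only <- .subset(m, 1)

print(c(by_name, by_name2, by_position))
print(first_only)
print(first_only["3450"])   # lookup by name in an unnamed vector
print(first_only[3450])     # index past the end of a length-one vector
```

If the last two lookups come back as NA, the two .subset() rows in the benchmark are timing a lookup that never reaches row "3450", which would explain why they look so fast.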
1 answer

You can use m[["3450", 1]]. The [[ operator selects only one element (the first match, I believe) and then returns it, whereas [ is used to select more than one element. Ideally, you would not be indexing by row names in the first place...

    microbenchmark(m["3450", 1], m[["3450", 1]], m[3450, 1], times=1000)

    Unit: nanoseconds
               expr      min       lq   median       uq       max neval
       m["3450", 1] 74898303 76755304 78038970 87569666 231740997  1000
     m[["3450", 1]]    30790    32657    48673    55671    241340  1000
         m[3450, 1]      623     1245     2800     6532     26125  1000
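If row names are required as keys but the same rows are read repeatedly, one further option is to translate the names to integer positions once and then index by position. This is only a sketch under assumptions, not something the benchmark above measured: the matrix size, its values and the keys vector are hypothetical, and match() from base R does the name-to-position translation.

```r
# Hypothetical example data: a smaller matrix with distinct values.
n <- 5000
m <- matrix(seq_len(n * 10), nrow = n, ncol = 10)
rownames(m) <- as.character(seq_len(n))

keys <- c("3450", "17", "4999")   # hypothetical row names to read

# Translate all names to integer positions in one pass ...
pos <- match(keys, rownames(m))
stopifnot(!anyNA(pos))            # fail early on unknown names

# ... then read with plain integer indexing, as often as needed.
values <- m[pos, 1]
print(values)
```

The name lookup is then paid once per batch of keys rather than once per read. Whether this helps in practice depends on how often the same positions are reused.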

Source: https://habr.com/ru/post/1485606/

