What is the time complexity of finding a name in the list R?

Question

What is the time complexity of finding a name in the list R?

I was desperately trying to find the answer through Google and could not. I'm going to do the test myself, but I thought that maybe someone here knows the answer, or at least the link where this is documented.

Expand my question: suppose I have a list L in R of length N , where N quite large (say, 10,000, 100,000, 1 million or more). Suppose my list has names for each item. `

I wonder how long it takes to get one named record, i.e. for execution

  L[[ "any_random_name" ]]

This time, O(N) , i.e. proportional to the length of the list, or it is O(1) , that is, a constant independent of the name of the list. or can it be O( log N ) ?

+5

list time-complexity r lookup

Mateo Dec 27 '16 at 23:04

source share

1 answer

Peter Ellis · Answer 1 · 2016-12-28T20:31:04+0000

Interestingly, the answer is both O(1) and O(n) . Time depends not so much on the length of the list as on the length of the list before the named element is needed.

Define some lists of different lengths:

 library(microbenchmark) makelist <- function(n){ L <- as.list(runif(n)) names(L) <- paste0("Element", 1:n) return(L) } L100 <- makelist(100) L1000 <- makelist(1000) LMillion <- makelist(10^6) L10Million <- makelist(10^7)

If we try to extract an element with the name Element100 ours from each of them - the 100th element - it takes basically the same length of time:

 microbenchmark( L10Million[["Element100"]], LMillion[["Element100"]], L1000[["Element100"]], L100[["Element100"]] ) Unit: nanoseconds expr min lq mean median uq max neval cld L10Million[["Element100"]] 791 791 996.59 792 1186 2766 100 a LMillion[["Element100"]] 791 791 1000.56 989 1186 1976 100 a L1000[["Element100"]] 791 791 1474.64 792 1186 28050 100 a L100[["Element100"]] 791 791 1352.21 792 1186 17779 100 a

But if we try to get the last element, it takes O (n) time

 microbenchmark( L10Million[["Element10000000"]], LMillion[["Element1000000"]], L1000[["Element1000"]], L100[["Element100"]] ) Unit: nanoseconds expr min lq mean median uq max neval cld L10Million[["Element10000000"]] 154965791 165074635 172399030.13 169602043 175170244 235525230 100 c LMillion[["Element1000000"]] 15362773 16540057 17379942.70 16967712 17734922 22361689 100 b L1000[["Element1000"]] 9482 10668 17770.94 16594 20544 67557 100 a L100[["Element100"]] 791 1186 3995.15 3556 6322 24100 100 a library(ggplot2) library(dplyr) res <- data.frame(x = c(100, 1000, 10^6, 10^7), y = c(3556, 16594, 16967715, 169602041)) ggplot(res, aes(x = x, y = y ))+ geom_smooth(method = "lm") + geom_point(, size = 3) + scale_x_log10() + scale_y_log10()

What is the time complexity of finding a name in the list R?

More articles: