I was curious about the speed of string comparison in R, prompted by deciding between != and ==, and I wondered how quickly the comparisons can bail out.
Suppose I have a vector with two levels, one common and one rare (to try to amplify the effect I'm after):
# ~5% 'ALICE', ~95% 'HAL90000000000'
x <- sample(c('ALICE', 'HAL90000000000'), size = 1000, replace = TRUE, prob = c(0.05, 0.95))
My conjecture (assuming the comparison short-circuits) was that the operation
x != 'ALICE'
will be much faster than:
x == 'HAL90000000000'
since checking equality in the latter case presumably requires examining every character, whereas the first comparison can be falsified by the first or the last character alone (depending on which end the algorithm works from).
But when I test this, that does not seem to be the case (the results were inconclusive over repeated trials, if anything with a very slight bias towards == being faster?!). Or is this not a fair test?
> microbenchmark(x != 'ALICE', x == 'HAL90000000000')
Unit: microseconds
                  expr   min     lq    mean median     uq    max neval
          x != "ALICE" 4.520 4.5505 4.61831 4.5775 4.6525  4.970   100
x == "HAL90000000000" 3.766 3.8015 4.00386 3.8425 3.9200 13.766   100
Why is this?
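To take the two-level sampling out of the picture, here is a sketch of the more direct test I have in mind, using long strings that differ from a reference only in their first or only in their last character (the strings and variable names are invented purely for illustration; strrep() needs R >= 3.3.0). If the comparison short-circuited at the first mismatch, I would expect the first expression to benchmark clearly faster than the second:

library(microbenchmark)

# reference string plus two strings of the same length:
# one differs in its first character, the other only in its last
ref        <- strrep('A', 1000)
diff_first <- paste0('B', strrep('A', 999))
diff_last  <- paste0(strrep('A', 999), 'B')

x_first <- rep(diff_first, 1000)
x_last  <- rep(diff_last,  1000)

microbenchmark(x_first == ref, x_last == ref)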
EDIT:
I guess this is because it is doing a full string comparison either way, but if so, is there any way to get R to optimise these? I get no benefit from obscuring how long it takes to match long versus short strings, and I am not worried about password-style timing attacks.
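To be concrete about the kind of optimisation I mean: something along the lines of comparing integer codes instead of the strings themselves, for example via a factor (just a sketch, not something I have benchmarked):

# do the string matching once, then compare integer codes element-wise
f        <- factor(x)
hal_code <- match('HAL90000000000', levels(f))

as.integer(f) == hal_code   # same logical result as x == 'HAL90000000000'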