How to find the number of identical elements in two vectors?

I have two vectors:

 a <- letters[1:5]
 b <- c('a','k','w','p','b','b')

Now I want to count how many times each letter in the vector a appears in b. The result I want is:

 # 1  2  0  0  0

What should I do?

3 answers

tabulate works on integer vectors and is fast. Match your letters against the universe of possible letters with match(), then tabulate the resulting indices; pass length(a) as the number of bins to ensure there is one count for each possible value.

> tabulate(match(b, a), length(a))
 [1] 1 2 0 0 0
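To make the two steps visible, here is a minimal sketch that splits the one-liner apart. The variable names idx and counts are only illustrative. It relies on the documented behaviour that match() returns NA for values with no match and that tabulate() ignores NA entries.

```r
a <- letters[1:5]
b <- c('a', 'k', 'w', 'p', 'b', 'b')

# Step 1: position of each element of b within a (NA when absent from a)
idx <- match(b, a)
print(idx)

# Step 2: count how often each position occurs; NA entries are ignored,
# and nbins = length(a) guarantees one counter per element of a
counts <- tabulate(idx, nbins = length(a))
print(counts)
```

The letters k, w and p map to NA in the first step, so they contribute nothing to the counts.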

This is faster than the "obvious" table() solution:

library(microbenchmark)
f0 = function() table(factor(b,levels=a))
f1 = function() tabulate(match(b, a), length(a))

and then

> microbenchmark(f0(), f1())
Unit: microseconds
 expr     min       lq  median       uq     max neval
 f0() 566.824 576.2985 582.950 594.4200 798.275   100
 f1()  56.816  60.0180  63.305  65.4185 120.441   100

It is also more general; for example, it matches numeric values without coercing them to their string representation.
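As a sketch of the numeric case, the same pattern can be applied to made-up numeric vectors. The values in x and y below are hypothetical and are not from the question.

```r
# Hypothetical numeric data: count how often each value of x occurs in y
x <- c(10, 20, 30)
y <- c(20, 20, 5, 10, 20)

# match() compares the numbers directly; 5 is not in x, so it maps to NA
num_counts <- tabulate(match(y, x), nbins = length(x))

# Label the counts with the values they belong to (optional)
names(num_counts) <- x
print(num_counts)
```

Here 10 occurs once, 20 three times and 30 not at all; the value 5 is dropped because it has no match in x.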


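Convert b to a factor with the levels given by a. Values that are not in a become <NA>. When you tabulate, they are dropped (unless you specify useNA="ifany").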

table(factor(b,levels=a))

a b c d e 
1 2 0 0 0 
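For comparison, a small sketch of what the useNA="ifany" option changes. The name tab_na is only illustrative.

```r
a <- letters[1:5]
b <- c('a', 'k', 'w', 'p', 'b', 'b')

# Elements of b that are not levels of the factor become NA;
# useNA = "ifany" adds an extra <NA> cell that counts them
tab_na <- table(factor(b, levels = a), useNA = "ifany")
print(tab_na)
```

The extra cell holds 3 here, one for each of k, w and p, which is a quick way to see how many elements of b fell outside a.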
> sapply(a, function(x) sum(x == b))

a b c d e 
1 2 0 0 0 

An alternative solution. The anonymous function can be modified to do fuzzy matching of names with a package such as stringdist.
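A rough sketch of the fuzzy variant follows. To stay free of extra packages it swaps in utils::adist (generalized Levenshtein distance, shipped with base R) in place of stringdist; the word lists and the threshold of 2 edits are assumptions chosen purely for illustration.

```r
# Hypothetical data: reference names and possibly misspelled observations
ref <- c("apple", "banana")
obs <- c("appel", "banana", "bananna", "kiwi")

# Assumed tolerance: treat strings within 2 edits as the same name
max_dist <- 2

# For each reference name, count the observations close enough to it
fuzzy_counts <- sapply(ref, function(x) sum(utils::adist(x, obs) <= max_dist))
print(fuzzy_counts)
```

With stringdist installed, the call to utils::adist could presumably be replaced by one of that package's distance functions; the surrounding sapply() structure stays the same.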


Source: https://habr.com/ru/post/1544952/

