match() vs %in% operator

Question

From what I read in ?match, %in% is defined as:

 "%in%" <- function(x, table) match(x, table, nomatch = 0) > 0

Why do I get a different result with match(x, dict[["word"]], 0L)

 vapply(strsplit(df$text, " "), function(x) sum(dict[["score"]][match(x, dict[["word"]], 0L)]), 1)
 #[1]  2 -2  3 -2

than with dict[["word"]] %in% x?

 vapply(strsplit(df$text, " "), function(x) sum(dict[["score"]][dict[["word"]] %in% x]), 1)
 #[1]  2 -2  1 -1

Data

 library(dplyr)
 df <- data_frame(text = c("I love pandas", "I hate monkeys",
                           "pandas pandas pandas", "monkeys monkeys"))
 dict <- data_frame(word = c("love", "hate", "pandas", "monkeys"),
                    score = c(1, -1, 1, -1))

Update

After Richard’s explanation, I now understand my initial mistake. The %in% operator returns a logical vector:

 > sapply(strsplit(df$text, " "), function(x) dict[["word"]] %in% x)
       [,1]  [,2]  [,3]  [,4]
 [1,]  TRUE FALSE FALSE FALSE
 [2,] FALSE  TRUE FALSE FALSE
 [3,]  TRUE FALSE  TRUE FALSE
 [4,] FALSE  TRUE FALSE  TRUE

And match() returns location numbers:

 > sapply(strsplit(df$text, " "), function(x) match(x, dict[["word"]], 0L))
 [[1]]
 [1] 0 1 3

 [[2]]
 [1] 0 2 4

 [[3]]
 [1] 3 3 3

 [[4]]
 [1] 4 4
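To see why those two return types lead to different sums, here is a minimal sketch using only the third sentence from the data ("pandas pandas pandas"). The names words, score, x, pos and hit are introduced here for illustration and are not part of the original code.

```r
# Dictionary from the question, as plain vectors (illustrative names)
words <- c("love", "hate", "pandas", "monkeys")
score <- c(1, -1, 1, -1)

# Third sentence, already split into words
x <- c("pandas", "pandas", "pandas")

# match() gives one position per element of x, so repeats are kept
pos <- match(x, words, 0L)
pos              # 3 3 3
sum(score[pos])  # 3

# %in% gives one TRUE/FALSE per dictionary word, so each word counts once
hit <- words %in% x
hit              # FALSE FALSE TRUE FALSE
sum(score[hit])  # 1
```

The first sum has three terms because the position 3 appears three times in the index; the second has one term because the logical index has a single TRUE.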

r match

Steven beaupré Jan 23 '15 at 6:20

1 answer

Rich scriven · Accepted Answer · 2015-01-23T17:57:35+0000

match() returns an integer vector giving, for each element of x, the position of its first match in table. That position is greater than 1 whenever the match is not the first element of table.

%in% returns a logical vector, in which a match (TRUE) always counts as 1 when coerced to an integer.

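Therefore, the sums in your calculations are likely to differ: indexing by the positions from match() picks up a score for every occurrence of a word in the text, while indexing by the logical vector from dict[["word"]] %in% x picks up each dictionary word at most once.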
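As a rough illustration, the sketch below checks the definition of %in% quoted from ?match and shows that the two calls in the question also have their arguments in opposite order. The vectors words, score and x are made up for this example, and the result names are hypothetical.

```r
words <- c("love", "hate", "pandas", "monkeys")
score <- c(1, -1, 1, -1)
x <- c("I", "hate", "monkeys", "monkeys")

# x %in% words is match(x, words, nomatch = 0L) > 0L:
# one value per element of x
same_def <- identical(x %in% words, match(x, words, nomatch = 0L) > 0L)
same_def       # TRUE

# The question used words %in% x instead: one value per element of words
in_dict <- words %in% x
in_dict        # FALSE TRUE FALSE TRUE

# Position indexing counts every occurrence; zero positions are dropped by `[`
by_position <- sum(score[match(x, words, 0L)])
by_position    # -3

# Logical indexing counts each dictionary word at most once
by_logical <- sum(score[in_dict])
by_logical     # -2
```

Which form is appropriate depends on whether repeated words should add to the total: the match() version scores every occurrence, the %in% version scores only presence.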
