Preparation (added tolower here):
library(stringdist)  # provides stringdist(), used below
txt1 <- c('The quick brown fox jumps over the lazy dog')
txt2 <- c('Te quick foks jump ovar lazzy dogg')
words <- unlist(strsplit(tolower(as.character(txt1)), " "))
words.p <- unlist(strsplit(tolower(as.character(txt2)), " "))
Get the distance from each word in words to every word in words.p:
dists <- sapply(words, Map, f=stringdist, list(words.p), method="jaccard")
For each word in words, find the closest word from words.p:
matches <- words.p[sapply(dists, which.min)]
cbind(words, matches)
     words matches
[1,] "the" "te"
[2,] "quick" "quick"
[3,] "brown" "ovar"
[4,] "fox" "foks"
[5,] "jumps" "jump"
[6,] "over" "ovar"
[7,] "the" "te"
[8,] "lazy" "lazzy"
[9,] "dog" "dogg"
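The same lookup can also be sketched with a full distance matrix, which some readers may find easier to inspect than the list returned by sapply/Map. This is only an illustrative alternative, assuming the stringdist package is installed; the name dmat and the hard-coded word vectors (the lowercased, split versions of txt1 and txt2) are introduced here for the example.

```r
library(stringdist)

# Lowercased, split versions of txt1 and txt2 from above
words   <- c("the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog")
words.p <- c("te", "quick", "foks", "jump", "ovar", "lazzy", "dogg")

# Rows correspond to words, columns to words.p
dmat <- stringdistmatrix(words, words.p, method = "jaccard")
dimnames(dmat) <- list(words, words.p)

# For every row, the column index of the smallest distance
closest <- apply(dmat, 1, which.min)
matches <- words.p[closest]
cbind(words, matches)
```

Printing round(dmat, 2) shows all distances at once, which helps explain matches such as "brown" and "ovar": with the Jaccard method and the default q = 1, words are compared as sets of letters, and "ovar" simply shares more letters with "brown" than any other candidate does.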
EDIT:
To get the best-matching pair of words, first take the minimum distance from each word in words to all words in words.p:
mindists <- sapply(dists, min)
Then select the word in words that has the smallest of these minimum distances:
words[which.min(mindists)]
Or, in a single expression:
words[which.min(sapply(dists, min))]
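The expression above returns only the word from words. If both members of the best pair are needed, one possible sketch is to locate the minimum in a distance matrix. The names dmat, best, best.pair, ties and tied.pairs are introduced here for illustration, and the word vectors are again the lowercased, split versions of txt1 and txt2.

```r
library(stringdist)

words   <- c("the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog")
words.p <- c("te", "quick", "foks", "jump", "ovar", "lazzy", "dogg")

# Rows correspond to words, columns to words.p
dmat <- stringdistmatrix(words, words.p, method = "jaccard")

# Row/column position of the smallest distance (the first one if there are ties)
best <- arrayInd(which.min(dmat), dim(dmat))
best.pair <- c(words[best[1]], words.p[best[2]])
best.pair

# All pairs that share the minimum distance
ties <- which(dmat == min(dmat), arr.ind = TRUE)
tied.pairs <- cbind(words[ties[, "row"]], words.p[ties[, "col"]])
tied.pairs
```

Note that which.min reports only the first minimum. With the Jaccard method, "lazy"/"lazzy" and "dog"/"dogg" also have distance 0, because they consist of the same set of letters, so tied.pairs lists them alongside "quick"/"quick". A method such as "lv" or "osa" may be a better choice if repeated letters should count as differences.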