Thanks to smirkingman for a nice code post. Here is the optimized version.
1) Use Asc (Mid $ (s1, i, 1) instead. Numerical comparison is usually faster than text.
2) Use Mid $ istead of Mid, as this is later a variant of ver. and adding $ is the string ver.
3) Use the app function for min. (personal preference only)
4) Use Long instead of integers since it uses excel.
Function Levenshtein(ByVal string1 As String, ByVal string2 As String) As Long Dim i As Long, j As Long Dim string1_length As Long Dim string2_length As Long Dim distance() As Long string1_length = Len(string1) string2_length = Len(string2) ReDim distance(string1_length, string2_length) For i = 0 To string1_length distance(i, 0) = i Next For j = 0 To string2_length distance(0, j) = j Next For i = 1 To string1_length For j = 1 To string2_length If Asc(Mid$(string1, i, 1)) = Asc(Mid$(string2, j, 1)) Then distance(i, j) = distance(i - 1, j - 1) Else distance(i, j) = Application.WorksheetFunction.Min _ (distance(i - 1, j) + 1, _ distance(i, j - 1) + 1, _ distance(i - 1, j - 1) + 1) End If Next Next Levenshtein = distance(string1_length, string2_length) End Function
UPDATE
For those who want this: I think itβs safe to say that most people use the Levenshtein distance to calculate fuzzy matches. Here is a way to do this, and I added an optimization that you can specify min. match% to return (default 70% +. You enter percentages such as β50β or β80β, or β0β to run the formula independently).
The speedup is due to the fact that the function checks whether this is possible in the percentage that you give by checking the length of two lines. Please note that there are some areas where this function can be optimized, but I saved it for this purpose for readability. I concatenated the distance as a result to prove functionality, but you can play with it :)
Function FuzzyMatch(ByVal string1 As String, _ ByVal string2 As String, _ Optional min_percentage As Long = 70) As String Dim i As Long, j As Long Dim string1_length As Long Dim string2_length As Long Dim distance() As Long, result As Long string1_length = Len(string1) string2_length = Len(string2) ' Check if not too long If string1_length >= string2_length * (min_percentage / 100) Then ' Check if not too short If string1_length <= string2_length * ((200 - min_percentage) / 100) Then ReDim distance(string1_length, string2_length) For i = 0 To string1_length: distance(i, 0) = i: Next For j = 0 To string2_length: distance(0, j) = j: Next For i = 1 To string1_length For j = 1 To string2_length If Asc(Mid$(string1, i, 1)) = Asc(Mid$(string2, j, 1)) Then distance(i, j) = distance(i - 1, j - 1) Else distance(i, j) = Application.WorksheetFunction.Min _ (distance(i - 1, j) + 1, _ distance(i, j - 1) + 1, _ distance(i - 1, j - 1) + 1) End If Next Next result = distance(string1_length, string2_length) 'The distance End If End If If result <> 0 Then FuzzyMatch = (CLng((100 - ((result / string1_length) * 100)))) & _ "% (" & result & ")" 'Convert to percentage Else FuzzyMatch = "Not a match" End If End Function