Reliable function to get substring position in string in Swift

This works well for English:

    public static func posOf(needle: String, haystack: String) -> Int {
        return haystack.distance(from: haystack.startIndex, to: (haystack.range(of: needle)?.lowerBound)!)
    }

But for non-Latin scripts, the return value is always too small. For example, "का" is counted as one unit instead of two.

    posOf(needle: "काम", haystack: "वह बीना की खुली कोयला खदान में काम करता था।") // 21

Later, I pass 21 to NSRange(location:length:), where it needs to be 28 for the NSRange to work.

1 answer

A Swift String is a collection of Character values, and each Character represents an "extended grapheme cluster."

NSString is a sequence of UTF-16 code units.

Example:

    import Foundation

    // Swift 4 and later: count directly on the String (Swift 3 used .characters.count)
    print("का".count)                 // 1
    print(("का" as NSString).length)  // 2

Swift String ranges are represented as Range<String.Index>, and NSString ranges are represented as NSRange.

Your function counts the number of Character values from the start of the haystack to the start of the needle, and this differs from the number of UTF-16 code units.
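To make the difference concrete, here is a minimal sketch that counts the same syllable in each of the string's views. It uses only the standard library; the variable name syllable is just for illustration.

```swift
// "का" consists of two Unicode scalars:
// U+0915 DEVANAGARI LETTER KA followed by U+093E DEVANAGARI VOWEL SIGN AA.
let syllable = "का"

print(syllable.count)                 // 1 (Characters, i.e. grapheme clusters)
print(syllable.unicodeScalars.count)  // 2 (Unicode scalars)
print(syllable.utf16.count)           // 2 (UTF-16 code units, what NSString counts)
print(syllable.utf8.count)            // 6 (UTF-8 code units)
```

An offset computed in one view is only meaningful in that view, which is why a Character offset cannot be passed to NSRange as-is.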

If you need an NSRange-compatible offset, the simplest approach is to use the range(of:) method of NSString:

    import Foundation

    let haystack = "वह बीना की खुली कोयला खदान में काम करता था।"
    let needle = "काम"

    // Offset counted in Characters (grapheme clusters)
    if let range = haystack.range(of: needle) {
        let pos = haystack.distance(from: haystack.startIndex, to: range.lowerBound)
        print(pos) // 21
    }

    // Offset counted in UTF-16 code units, as NSRange expects
    let nsRange = (haystack as NSString).range(of: needle)
    if nsRange.location != NSNotFound {
        print(nsRange.location) // 31
    }

Alternatively, use the utf16 view of the string to count UTF-16 code units:

    if let range = haystack.range(of: needle) {
        // Since Swift 4, String.Index is shared across the string's views,
        // so the bound can be measured directly in the UTF-16 view.
        let pos = haystack.utf16.distance(from: haystack.utf16.startIndex, to: range.lowerBound)
        print(pos) // 31
    }

(See, for example, "NSRange to Range<String.Index>" for additional conversion methods between Range<String.Index> and NSRange.)
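As one such conversion, Foundation in Swift 4 and later provides the initializers NSRange(_:in:) and Range(_:in:). The sketch below is not part of the original answer; the helper name utf16Range(of:in:) is hypothetical, and it returns nil instead of crashing when the needle is absent.

```swift
import Foundation

let haystack = "वह बीना की खुली कोयला खदान में काम करता था।"
let needle = "काम"

// Hypothetical helper: the needle's range as an NSRange measured in
// UTF-16 code units, or nil if the needle does not occur in the haystack.
func utf16Range(of needle: String, in haystack: String) -> NSRange? {
    guard let range = haystack.range(of: needle) else { return nil }
    return NSRange(range, in: haystack)
}

if let nsRange = utf16Range(of: needle, in: haystack) {
    print(nsRange.location, nsRange.length) // 31 3

    // Convert back to a Range<String.Index> to slice the Swift string.
    if let range = Range(nsRange, in: haystack) {
        print(haystack[range]) // काम
    }
}
```

Returning an optional avoids the force unwrap in the question's posOf, which traps when the needle is not found.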


Source: https://habr.com/ru/post/1261713/

