I know you said you wanted to use CharacterSet, not String, but CharacterSet does not (yet, at least) support characters composed of more than one Unicode.Scalar. See the "family" character (👩‍👩‍👧‍👦) or the international flag characters (for example, "🇯🇵") that Apple demonstrated in the discussion of strings in the WWDC 2017 video What's New in Swift. Emoji with skin tone modifiers also exhibit this behavior (for example, 👩🏽).
As a result, I would be wary of using CharacterSet (which is a "set of Unicode character values for use in search operations"). Or, if you want to provide this method for convenience, keep in mind that it will not work correctly with characters represented by multiple Unicode scalars.
So, you might offer a scanner that provides both CharacterSet and String renditions of the skip method:
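To make the distinction concrete, here is a minimal sketch that compares the number of characters (grapheme clusters) with the number of Unicode scalars for each kind of emoji mentioned above. The emoji are written as scalar escapes so their composition is visible; the variable names are only illustrative.

```swift
// Each literal below is one user-perceived character built from several scalars.
let family = "\u{1F469}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}" // 👩‍👩‍👧‍👦
let flag = "\u{1F1EF}\u{1F1F5}"   // 🇯🇵 (two regional indicator scalars)
let toned = "\u{1F469}\u{1F3FD}"  // 👩🏽 (base emoji + skin tone modifier)

// `count` measures Characters; `unicodeScalars.count` measures scalars.
print(family.count, family.unicodeScalars.count) // 1 7
print(flag.count, flag.unicodeScalars.count)     // 1 2
print(toned.count, toned.unicodeScalars.count)   // 1 2
```

Because a CharacterSet holds individual scalars, it has no way to represent any of these three strings as a single member.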
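    import Foundation

    class MyScanner {
        let string: String
        var index: String.Index

        init(_ string: String) {
            self.string = string
            index = string.startIndex
        }

        var remains: String { return String(string[index...]) }

        // One possible implementation: skip characters whose scalars are all in the set.
        func skip(charactersIn characters: CharacterSet) {
            while index < string.endIndex,
                  string[index].unicodeScalars.allSatisfy({ characters.contains($0) }) {
                index = string.index(after: index)
            }
        }

        // One possible implementation: skip whole characters found in the given string.
        func skip(charactersIn characters: String) {
            while index < string.endIndex, characters.contains(string[index]) {
                index = string.index(after: index)
            }
        }
    }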
So your simple example will work:
    let scanner = MyScanner("fizz buzz fizz")
    scanner.skip(charactersIn: CharacterSet.alphanumerics)
    scanner.skip(charactersIn: CharacterSet.whitespaces)
    print(scanner.remains) // "buzz fizz"
But use the String rendition if the characters you want to skip may consist of multiple Unicode scalars:
    let family = "👩\u{200D}👩\u{200D}👧\u{200D}👦" // 👩‍👩‍👧‍👦
    let boy = "👦"
    let charactersToSkip = family + boy
    let string = boy + family + "foobar" // 👦👩‍👩‍👧‍👦foobar
    let scanner = MyScanner(string)
    scanner.skip(charactersIn: charactersToSkip)
    print(scanner.remains) // foobar
As Michael Waterfall noted in the comments below, CharacterSet has a bug and does not even handle 32-bit Unicode.Scalar values correctly, which means it mishandles even single-scalar characters if the value exceeds 0xffff (which includes emoji, among others). The String rendition, however, handles these correctly.
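If you want to see how your own toolchain behaves, a quick check along these lines may help. It is only a sketch: it tests one scalar above 0xffff against both a CharacterSet and a String, and the constant names are illustrative. No expected output is given for the CharacterSet result, since it may depend on the Foundation version in use.

```swift
import Foundation

// 👦 is a single scalar whose value (0x1F466) exceeds 0xffff.
let boyScalar: Unicode.Scalar = "\u{1F466}"
let boyCharacter = Character(boyScalar)

// Membership test via CharacterSet (scalar based).
let set = CharacterSet(charactersIn: String(boyCharacter))
let viaCharacterSet = set.contains(boyScalar)

// Membership test via String (Character based).
let viaString = "\u{1F466}".contains(boyCharacter)

print("CharacterSet:", viaCharacterSet, "String:", viaString)
```

If the two results disagree on your platform, that is a sign to prefer the String rendition of skip for such characters.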