NSAttributedString and emojis: problem with positions and lengths

I color some parts of the text coming from the API (think "@mention", like on Twitter) using NSAttributedString.

The API gives me text and an array of objects representing parts of the text that are mentioned (or links, tags, etc.) that need to be colored.

But sometimes the coloring is shifted due to emojis.


For example, using this text:

"@ericd Some text. @apero"

The API gives:

[{"text": "ericd", "len": 6, "pos": 0}, {"text": "apero", "len": 6, "pos": 18}]

which I successfully translated to NSAttributedString using NSRange:

    for m in entities.mentions {
        let r = NSMakeRange(m.pos, m.len)
        myAttributedString.addAttribute(NSForegroundColorAttributeName, value: someValue, range: r)
    }

We can see that "pos": 18 is correct: this is where "@apero" begins. Both "@ericd" and "@apero" are colored as expected.

But when certain emoji combinations are used in the text, the API's positions do not translate well to NSAttributedString, and the coloring is offset:

"@ericd Some text. 😺✌🏻 @apero"

gives:

[{"text": "ericd", "len": 6, "pos": 0}, {"text": "apero", "len": 6, "pos": 22}]

"pos": 22: the author of the API claims that this is correct, and I understand their point of view.

Unfortunately, NSAttributedString does not agree, and my coloring is off.

The last characters of the second mention are not colored (because "pos" is too small due to the emojis?).

As you may have guessed, I can't change the way the API behaves, so I need to adapt on the client side.

Except that... I have no idea what to do. Should I try to detect which emojis are in the text and manually adjust the position of the mentions when there is a problematic emoji? But what would be the criteria for deciding which emoji shifts the position and which does not? And how do I decide how much offset I need? Maybe the problem is caused by NSAttributedString?

I understand that this is related to the length of emojis once composed, compared with their length as discrete characters, but... well... I'm lost (sigh).


Note that I tried to implement a solution similar to this stuff, because my API is compatible with it, but it worked only partially: some emojis still broke the indices.

1 answer

A Swift String provides different "views" on its contents. A good overview is given in "Strings in Swift 2" on the Swift Blog:

  • characters is a collection of Character values, or extended grapheme clusters.
  • unicodeScalars is a collection of Unicode scalar values.
  • utf8 is a collection of UTF-8 code units.
  • utf16 is a collection of UTF-16 code units.
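To make the difference between the views concrete, here is a small sketch that counts the sample text in each unit. It assumes Swift 4 or later, where text.count replaces characters.count, and it spells the emojis as Unicode escapes so that the scalar sequence is unambiguous.

```swift
// The sample text; the escapes spell 😺 (U+1F63A), ✌ (U+270C) and 🏻 (U+1F3FB).
let text = "@ericd Some text. \u{1F63A}\u{270C}\u{1F3FB} @apero"

// The same string measured in the units of each view.
print(text.unicodeScalars.count) // 28 Unicode scalar values
print(text.utf16.count)          // 30 UTF-16 code units
print(text.utf8.count)           // 36 UTF-8 code units

// Extended grapheme clusters; the result depends on the Unicode
// segmentation rules of the Swift runtime, so no value is shown here.
print(text.count)
```

The three emoji scalars take five UTF-16 code units, which is why offsets counted in scalars and offsets counted in UTF-16 drift apart after them.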

As it turned out in the discussion, pos and len from your API are indices into the Unicode scalars view.

On the other hand, the addAttribute() method of NSMutableAttributedString takes an NSRange, i.e. a range corresponding to the indices of the UTF-16 code units in an NSString.

String provides methods to "translate" between the indices of the different views (compare NSRange to Range<String.Index>):
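The mismatch can be checked directly on the sample text. The following sketch (Swift 4.2 or later assumed) measures the position of the second "@" once in Unicode scalars and once in UTF-16 code units; the variable names are only illustrative.

```swift
// The sample text; the escapes spell 😺 (U+1F63A), ✌ (U+270C) and 🏻 (U+1F3FB).
let text = "@ericd Some text. \u{1F63A}\u{270C}\u{1F3FB} @apero"

// Index of the "@" that starts the second mention.
// Force-unwrapped only because the sample text is known to contain it.
let mentionStart = text.lastIndex(of: "@")!

// Offset counted in Unicode scalars: this is what the API reports.
let scalarOffset = text.unicodeScalars.distance(
    from: text.unicodeScalars.startIndex, to: mentionStart)

// Offset counted in UTF-16 code units: this is what NSRange expects.
let utf16Offset = text.utf16.distance(
    from: text.utf16.startIndex, to: mentionStart)

print(scalarOffset) // 22
print(utf16Offset)  // 24
```

Passing 22 as an NSRange location therefore starts the colored range two code units too early, which leaves the last two characters of "@apero" uncolored.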

    let text = "@ericd Some text. 😺✌🏻 @apero"
    let pos = 22
    let len = 6

    // Compute String.UnicodeScalarView indices for first and last position:
    let from32 = text.unicodeScalars.index(text.unicodeScalars.startIndex, offsetBy: pos)
    let to32 = text.unicodeScalars.index(from32, offsetBy: len)

    // Convert to String.UTF16View indices:
    let from16 = from32.samePosition(in: text.utf16)
    let to16 = to32.samePosition(in: text.utf16)

    // Convert to NSRange by computing the integer distances:
    let nsRange = NSRange(location: text.utf16.distance(from: text.utf16.startIndex, to: from16),
                          length: text.utf16.distance(from: from16, to: to16))

This NSRange is what you need for the attributed string:

    let attrString = NSMutableAttributedString(string: text)
    attrString.addAttribute(NSForegroundColorAttributeName, value: UIColor.red, range: nsRange)

Update for Swift 4 (Xcode 9): In Swift 4, the standard library provides methods to convert between Swift String ranges and NSString ranges, so the calculation simplifies to:

    let text = "@ericd Some text. 😺✌🏻 @apero"
    let pos = 22
    let len = 6

    // Compute String.UnicodeScalarView indices for first and last position:
    let fromIdx = text.unicodeScalars.index(text.unicodeScalars.startIndex, offsetBy: pos)
    let toIdx = text.unicodeScalars.index(fromIdx, offsetBy: len)

    // Compute corresponding NSRange:
    let nsRange = NSRange(fromIdx..<toIdx, in: text)
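Since the API returns a whole array of mentions, the conversion could be wrapped in a small function. The sketch below is one possible way to do that, not part of any library: the helper name nsRange(scalarOffset:scalarLength:in:) and the tuple layout of mentions are made up for illustration. It uses index(_:offsetBy:limitedBy:) so that an out-of-range pos or len yields nil instead of a crash.

```swift
import Foundation

// Hypothetical helper: converts a range given in Unicode scalar offsets
// (as reported by the API) into an NSRange counted in UTF-16 code units.
// Returns nil if the offsets fall outside the text.
func nsRange(scalarOffset pos: Int, scalarLength len: Int, in text: String) -> NSRange? {
    let scalars = text.unicodeScalars
    guard pos >= 0, len >= 0,
          let from = scalars.index(scalars.startIndex, offsetBy: pos, limitedBy: scalars.endIndex),
          let to = scalars.index(from, offsetBy: len, limitedBy: scalars.endIndex)
    else { return nil }
    return NSRange(from..<to, in: text)
}

// The sample text; the escapes spell 😺 (U+1F63A), ✌ (U+270C) and 🏻 (U+1F3FB).
let text = "@ericd Some text. \u{1F63A}\u{270C}\u{1F3FB} @apero"

// Positions and lengths as returned by the API for this text.
let mentions = [(pos: 0, len: 6), (pos: 22, len: 6)]

let ranges = mentions.map {
    nsRange(scalarOffset: $0.pos, scalarLength: $0.len, in: text)
}

// Print the substring covered by each converted range.
for case let range? in ranges {
    print(NSString(string: text).substring(with: range))
}
// prints "@ericd" and then "@apero"
```

Each non-nil range can then be passed to addAttribute() in the same loop that currently builds the range with NSMakeRange.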

Source: https://habr.com/ru/post/1014800/