I get a string from an API that contains anchor tags, so I create an NSAttributedString from it and display it in a UITextView so that the links are tappable.
The problem is that the input string is not valid HTML: it contains unescaped Unicode characters, such as:
- HORIZONTAL ELLIPSIS, Unicode: U+2026, UTF-8: E2 80 A6
- EM DASH, Unicode: U+2014, UTF-8: E2 80 94
While I can deal with these specific cases, I am worried about any other Unicode characters that may come in that I don't know about yet.
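For reference, the UTF-8 byte sequences listed above can be reproduced with a small encoder. This is only a minimal sketch in plain C (which also compiles as Objective-C); `utf8_encode` is a hypothetical helper name for illustration, not part of any Apple API.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8.
 * Writes 1 to 4 bytes into out and returns the number of bytes written,
 * or 0 if cp is a surrogate or lies above U+10FFFF.
 * (utf8_encode is a hypothetical helper, not a system function.) */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                        /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                     /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return 0;                       /* surrogates are not encodable */
        }
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Calling `utf8_encode(0x2014, buf)` fills `buf` with the three bytes E2 80 94, and `utf8_encode(0x2026, buf)` gives E2 80 A6. If those three bytes were read with a single-byte encoding instead of UTF-8, each byte would turn into its own character, which may be related to what I am seeing.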
Example:
    // Sample input containing an unescaped em dash (U+2014).
    NSString *fromAPI = @"Reagan \u2014 saying";

    // Ask the importer to treat the data as HTML.
    NSDictionary *options = @{NSDocumentTypeDocumentAttribute : NSHTMLTextDocumentType};

    // Convert the string to UTF-8 bytes.
    NSData *data = [fromAPI dataUsingEncoding:NSUTF8StringEncoding
                         allowLossyConversion:NO];

    NSAttributedString *attributedString =
        [[NSAttributedString alloc] initWithData:data
                                         options:options
                              documentAttributes:nil
                                           error:nil];
This displays in a UITextView with the em dash rendered incorrectly.
How can I get it to correctly display the em dash and any other Unicode characters?