Convert a Unicode character or its XML/HTML entity to its Unicode number in Swift

Whether a Unicode character is given as a String or as its XML/HTML entity, how can you obtain its Unicode number? For example, if you are given the string "෴" and can generate its HTML code (෴), how would you then generate its Unicode number (U+0DF4)?

I am currently generating the HTML entities with the CFStringTransform API, using kCFStringTransformToXMLHex for the conversion. But there is no equivalent transform for the Unicode number.

2 answers

update: Xcode 9 • Swift 4

import Foundation

extension String {
    var html2AttributedString: NSAttributedString? {
        do {
            return try NSAttributedString(data: Data(utf8), options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
        } catch {
            print(error)
            return nil
        }
    }
    var unicodes: [UInt32] { return unicodeScalars.map{$0.value} }
}

Xcode 8 • Swift 3

import Foundation

extension String {
    var html2AttributedString: NSAttributedString? {
        do {
            return try NSAttributedString(data: Data(utf8), options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue], documentAttributes: nil)
        } catch {
            print(error)
            return nil
        }
    }
    var unicodes: [UInt32] { return unicodeScalars.map{$0.value} }
}

let str = "<span>&euro;€</span>".html2AttributedString?.string ?? ""
print(str.unicodes)     // [8364, 8364]
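The `unicodes` property above returns plain integers, while the question asks for the U+XXXX notation. A minimal sketch of that last formatting step, assuming uppercase hex padded to at least four digits is the form wanted; `unicodeNotation` is a hypothetical name introduced here, not part of the answer:

```swift
extension String {
    /// Code point of every Unicode scalar in the string, written as "U+XXXX".
    /// `unicodeNotation` is a hypothetical helper name used for illustration.
    var unicodeNotation: [String] {
        return unicodeScalars.map { scalar in
            let hex = String(scalar.value, radix: 16, uppercase: true)
            // Pad to at least four hex digits, so 0xDF4 becomes "0DF4".
            let padding = String(repeating: "0", count: max(0, 4 - hex.count))
            return "U+" + padding + hex
        }
    }
}

print("෴".unicodeNotation)   // ["U+0DF4"]
print("€".unicodeNotation)   // ["U+20AC"]
```

Applied to the result of `html2AttributedString?.string`, the same property should cover the entity case as well, since the entity has already been decoded to a character by that point.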

Have a look at SwiftSoup. SwiftSoup is a pure Swift, cross-platform library (macOS, iOS, tvOS, watchOS, and Linux!).

let text = "Hello &<> Å å π 新 there ¾ © »"

print(Entities.escape(text))
print(Entities.unescape(text))

print(Entities.escape(text, OutputSettings().encoder(String.Encoding.ascii).escapeMode(Entities.EscapeMode.base)))
print(Entities.escape(text, OutputSettings().charset(String.Encoding.ascii).escapeMode(Entities.EscapeMode.extended)))
print(Entities.escape(text, OutputSettings().charset(String.Encoding.ascii).escapeMode(Entities.EscapeMode.xhtml)))
print(Entities.escape(text, OutputSettings().charset(String.Encoding.utf8).escapeMode(Entities.EscapeMode.extended)))
print(Entities.escape(text, OutputSettings().charset(String.Encoding.utf8).escapeMode(Entities.EscapeMode.xhtml)))

Output:

"Hello &amp;&lt;&gt; Å å π 新 there ¾ © »"
"Hello &<> Å å π 新 there ¾ © »"


"Hello &amp;&lt;&gt; &Aring; &aring; &#x3c0; &#x65b0; there &frac34; &copy; &raquo;"
"Hello &amp;&lt;&gt; &angst; &aring; &pi; &#x65b0; there &frac34; &copy; &raquo;"
"Hello &amp;&lt;&gt; &#xc5; &#xe5; &#x3c0; &#x65b0; there &#xbe; &#xa9; &#xbb;"
"Hello &amp;&lt;&gt; Å å π 新 there ¾ © »"
"Hello &amp;&lt;&gt; Å å π 新 there ¾ © »"
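Some of the escaped output above uses hex numeric references such as `&#x65b0;`, which already contain the code point. If a single reference of that kind is all you have, the number can be read out of it directly without parsing HTML. This is a rough sketch with no dependencies, not something SwiftSoup provides; `scalarValue(ofNumericReference:)` is a hypothetical helper, and it deliberately ignores named entities like `&pi;`:

```swift
/// Returns the code point named by a single numeric character reference,
/// e.g. "&#x65b0;" (hex) or "&#26032;" (decimal).
/// Returns nil for anything else, including named entities such as "&pi;".
/// Hypothetical helper for illustration only.
func scalarValue(ofNumericReference reference: String) -> UInt32? {
    guard reference.hasPrefix("&#"), reference.hasSuffix(";") else { return nil }
    // Strip the leading "&#" and the trailing ";".
    let body = reference.dropFirst(2).dropLast()
    let value: UInt32?
    if let first = body.first, first == "x" || first == "X" {
        value = UInt32(body.dropFirst(), radix: 16)   // hex form
    } else {
        value = UInt32(body, radix: 10)               // decimal form
    }
    // Reject numbers that are not valid Unicode scalars (surrogates, > U+10FFFF).
    guard let v = value, Unicode.Scalar(v) != nil else { return nil }
    return v
}

if let v = scalarValue(ofNumericReference: "&#x65b0;") {
    print("U+" + String(v, radix: 16, uppercase: true))   // U+65B0
}
```

For mixed text or named entities, unescaping first (as in either answer) and then reading `unicodeScalars` is probably the more robust route.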

Source: https://habr.com/ru/post/1607388/