Finding a Unicode Script Char in Haskell

Question

I wanted to write a function that checks whether a Char represents a letter of the Cyrillic alphabet, purely for pedagogical reasons. A simple approximation for Russian is

import Data.Char (toLower)

isCyrillic c =
    let lc = toLower c
    in 'а' <= lc && lc <= 'я'

but I don’t like it because it doesn’t handle other languages that use Cyrillic. I could hard-code the ranges:

U+0400–U+04FF Cyrillic
U+0500–U+052F Cyrillic Supplement
U+2DE0–U+2DFF Cyrillic Extended-A
U+A640–U+A69F Cyrillic Extended-B
U+1C80–U+1C8F Cyrillic Extended-C

but this is not good practice either.
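For reference, the hard-coded approach could be sketched with base alone, roughly as follows. The names cyrillicBlocks and isCyrillicByRange are made up for illustration, and the table simply copies the five blocks listed above, so it would need manual updating whenever Unicode adds a new Cyrillic block.

```haskell
import Data.Char (ord)

-- Inclusive code point ranges of the Cyrillic blocks listed above.
-- (Hypothetical helper table, maintained by hand.)
cyrillicBlocks :: [(Int, Int)]
cyrillicBlocks =
  [ (0x0400, 0x04FF) -- Cyrillic
  , (0x0500, 0x052F) -- Cyrillic Supplement
  , (0x1C80, 0x1C8F) -- Cyrillic Extended-C
  , (0x2DE0, 0x2DFF) -- Cyrillic Extended-A
  , (0xA640, 0xA69F) -- Cyrillic Extended-B
  ]

-- True when the character's code point falls inside any listed block.
-- This is a block test, not a script test: non-letter characters that
-- live inside these blocks also return True.
isCyrillicByRange :: Char -> Bool
isCyrillicByRange c = any inBlock cyrillicBlocks
  where
    n = ord c
    inBlock (lo, hi) = lo <= n && n <= hi
```

Unlike the 'а'..'я' approximation, this also accepts letters such as 'ё' (U+0451) and the extended blocks, at the cost of keeping the table in sync by hand.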

Ideally, the function would be as simple as

isCyrillic c = unicodeScript c == Cyrillic

but this assumes the existence of an enumerated type of Unicode scripts (Unicode ranges would also work). Does such a thing exist anywhere?

unicode haskell

Alexey Romanov Mar 6 '18 at 18:37

1 answer

duplode · Accepted Answer · 2018-03-06T18:47:22+0000

property from text-icu's Data.Text.ICU.Char seems to fit the bill:

import Data.Text.ICU.Char

-- Compare the character's Unicode block with the Cyrillic block.
isCyrillic c = property Block c == Cyrillic
