How to use Unicode character composition with Kanji / Hanzi?

I am trying to find a workaround for displaying old and rare characters in Unicode using character composition. I am currently converting some dictionaries from EPWING to text, and there are 36 different characters that cannot be represented with regular UTF-8 text. The following is the EPWING gaiji-to-Unicode mapping section for one of the dictionaries I am converting. In a few places it has an interesting syntax that is apparently used to compose characters in different ways. I was hoping someone could identify the syntax and tell me where I could find documentation or a tutorial on how to use it.

s/<?w=b02a>/𡓦/g
s/<?w=b04b>/者/g
s/<?w=b064>/<⾱ 𤰇>/g
s/<?w=b077>/<彳<匕\/匕>>/g
s/<?w=b07c>/<山\/⺀>/g
s/<?w=b12e>/𥝝/g
s/<?w=b155>/</>/g
s/<?w=b156>/<\/>/g
s/<?w=b157>/<\/\/>/g
s/<?w=b158>/<こ[1]\/と|ヿ>/g
s/<?w=b16f>/<㗢>/g
s/<?w=b170>/<㗥>/g
s/<?w=b171>/ଏ/g
s/<?w=b175>/lb/g
s/<?w=b22a>//g
s/<?w=b234>/ff/g
s/<?w=b25e>/㯌/g
s/<?w=b271>/<扌 晉>/g
s/<?w=b36b>/𣴴/g
s/<?w=b373>/𥝱/g
s/<?w=b42c>/𦼠/g
s/<?w=b434>/<已\/大>/g
s/<?w=b438>/𩸽/g
s/<?w=b43a>/𩺊/g
s/<?w=b43f>/<㇀\/丶>/g
s/<?w=b440>/𠂆/g
s/<?w=b45a>/<?>/g
s/<?w=b45b>/<|>/g
s/<?w=b53d>/<?>/g
s/<?w=b53e>/<?>/g
s/<?w=b540>/<o>/g
s/<?w=b537>/<ト モ>/g
s/<?w=b541>/<一\/𠔀>/g
s/<?w=b544>/<?>/g
s/<?w=b546>/<[r45]卐>/g
s/<?w=b55f>/*/g
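
The same table can also be applied in a single pass with a dictionary lookup instead of one substitution per code. This is only a minimal sketch in Python, assuming every placeholder has the form `<?w=XXXX>` with four lowercase hexadecimal digits (which holds for all entries above). The names `GAIJI`, `PLACEHOLDER` and `replace_gaiji` are made up for illustration, and only a few entries of the table are copied in.

```python
import re

# Hypothetical subset of the table above: EPWING gaiji code -> replacement text.
# The replacements are written without sed's "\/" escaping.
GAIJI = {
    "b02a": "𡓦",
    "b04b": "者",
    "b077": "<彳<匕/匕>>",
    "b546": "<[r45]卐>",
}

# Matches a literal placeholder such as <?w=b077>. The "?" is escaped because
# it is a quantifier in Python regular expressions.
PLACEHOLDER = re.compile(r"<\?w=([0-9a-f]{4})>")


def replace_gaiji(text, table=GAIJI):
    """Replace every known placeholder and leave unknown codes untouched."""
    return PLACEHOLDER.sub(lambda m: table.get(m.group(1), m.group(0)), text)
```

Leaving unknown codes in place, rather than deleting them, makes it easier to spot gaiji that are still missing from the table.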

For example, to combine 彳 with 匕 above 匕, the mapping uses:

s/<?w=b077>/<彳<匕\/匕>>/g

And to rotate 卐 by 45 degrees, the mapping uses:

s/<?w=b546>/<[r45]卐>/g
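
For the purely positional cases, one possible workaround is to rewrite the bracket notation as a Unicode Ideographic Description Sequence, using U+2FF0 (⿰, left to right) and U+2FF1 (⿱, above to below). The sketch below is in Python and rests on assumptions about the dictionary's notation that are not confirmed by any documentation: that `A/B` inside `<...>` means "A above B", and that components written side by side (with or without a space) mean "A left of B". The function name `to_ids` is hypothetical. Markers such as `[r45]`, `[1]` and `|` are rejected, since their meaning is unclear and rotation has no IDS operator.

```python
LEFT_RIGHT = "\u2FF0"   # ⿰ IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT
ABOVE_BELOW = "\u2FF1"  # ⿱ IDEOGRAPHIC DESCRIPTION CHARACTER ABOVE TO BELOW


def to_ids(notation):
    """Convert e.g. '<彳<匕/匕>>' into a prefix-notation IDS string.

    Assumed grammar (a guess, not a documented format):
    '<' starts a group, '>' ends it, '/' means above-below,
    plain juxtaposition or a space means left-right.
    """
    pos = 0

    def parse_group():
        nonlocal pos
        pos += 1  # skip the opening '<'
        parts, op = [], LEFT_RIGHT
        while pos < len(notation) and notation[pos] != ">":
            ch = notation[pos]
            if ch == "<":
                parts.append(parse_group())  # nested group; pos is advanced
                continue
            if ch in "[]|":
                raise ValueError(f"unsupported marker {ch!r}")
            if ch == "/":
                op = ABOVE_BELOW
            elif not ch.isspace():
                parts.append(ch)
            pos += 1
        if pos >= len(notation) or not parts:
            raise ValueError("unterminated or empty group")
        pos += 1  # skip the closing '>'
        # The IDS operators used here are binary, so fold from the right.
        result = parts[-1]
        for part in reversed(parts[:-1]):
            result = op + part + result
        return result

    if not notation.startswith("<"):
        raise ValueError("expected '<' at the start")
    result = parse_group()
    if pos != len(notation):
        raise ValueError("trailing text after the group")
    return result
```

The input is the notation as it appears after substitution, with a plain `/` rather than sed's `\/`. An IDS only describes a character: renderers generally show the sequence as written rather than drawing one composed glyph, so this documents the intended shape instead of producing it.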

: , ? w =, gwing, .



. Unicode, 12.2, . .



Source: https://habr.com/ru/post/1761083/

