Transliteration from Cyrillic to Latin with ICU4J in Java

I need to do something fairly simple, but without hardcoding a hash map.

I have a String s written in Cyrillic. I need an example of how to turn it into Latin characters using a custom filter of some sort. To give a purely Latin example, so as not to confuse anyone: if String s = "sniff", I want it to look up particular letters and turn them into something else (there may be combinations of letters as well).

I see that ICU4J can do such things, but I have no idea how to achieve it, since I cannot find any working examples (or I'm just stupid).

Any help is appreciated.

Thanks, and best wishes.

PS: I need batch transliteration. I don't care about styled text or transliteration as you type, just a basic example of what using the ICU4J Transliterator looks like.

OK, I actually figured it out.

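    import com.ibm.icu.text.Transliterator;

    public class BulgarianToLatin {

        // ID of the built-in ICU transliterator (Bulgarian to Latin, BGN variant).
        public static String BULGARIAN_TO_LATIN = "Bulgarian-Latin/BGN";

        public static void main(String[] args) {
            // Put the Bulgarian (Cyrillic) text to transliterate here.
            String bgString = "";
            // Look up the predefined transliterator by its ID.
            Transliterator bulgarianToLatin = Transliterator.getInstance(BULGARIAN_TO_LATIN);
            // Convert the whole string in one call.
            String result1 = bulgarianToLatin.transliterate(bgString);
            System.out.println("Bulgarian to Latin:" + result1);
        }
    }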

Later edit: here is a rule-based transliterator as well (in case you do not want to use one of the pre-existing ones, or just want something made to order).

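    import com.ibm.icu.text.Transliterator;

    public class BulgarianToLatin {

        public static String BULGARIAN_TO_LATIN = "Bulgarian-Latin/BGN";

        public static void main(String[] args) {
            // The Cyrillic sample text is missing from this literal.
            String bgString = "                            \n  ";
            // Custom rules: a "::[...]" filter line, then "source > target;" rules.
            // Most of the Cyrillic source characters are missing from the rules
            // below and have to be supplied before the rules can work.
            String rules = "::[--ѢѣѪѫ];"
                    + " > B;"
                    + " > b;"
                    + " > V;"
                    + " > TS;"
                    + " > Ts;"
                    + " > ch;"
                    + " > SHT;"
                    + " > Sht;"
                    + " > sht;"
                    + "{}[[----][ѣѫ]] > Sh;"
                    + " > YA;"
                    + " > ya;";
            // Build a transliterator from the rule string instead of a built-in ID.
            Transliterator bulgarianToLatin =
                    Transliterator.createFromRules("temp", rules, Transliterator.FORWARD);
            String result1 = bulgarianToLatin.transliterate(bgString);
            System.out.println("Bulgarian to Latin:" + result1);
        }
    }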
1 answer

I wrote a method for transliterating the Cyrillic alphabet into Latin; perhaps it will be useful to somebody.

    public static String transliterate(String message) {
        // Parallel tables: abcCyr[i] is replaced by abcLat[i].
        // Russian alphabet (lower case, then upper case) in the same order as
        // abcLat, followed by the Latin letters, which map to themselves.
        char[] abcCyr = {' ',
            'а','б','в','г','д','е','ё','ж','з','и','й','к','л','м','н','о','п',
            'р','с','т','у','ф','х','ц','ч','ш','щ','ъ','ы','ь','э','ю','я',
            'А','Б','В','Г','Д','Е','Ё','Ж','З','И','Й','К','Л','М','Н','О','П',
            'Р','С','Т','У','Ф','Х','Ц','Ч','Ш','Щ','Ъ','Ы','Ь','Э','Ю','Я',
            'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q',
            'r','s','t','u','v','w','x','y','z',
            'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q',
            'R','S','T','U','V','W','X','Y','Z'};
        String[] abcLat = {" ",
            "a","b","v","g","d","e","e","zh","z","i","y","k","l","m","n","o","p",
            "r","s","t","u","f","h","ts","ch","sh","sch","","i","","e","ju","ja",
            "A","B","V","G","D","E","E","Zh","Z","I","Y","K","L","M","N","O","P",
            "R","S","T","U","F","H","Ts","Ch","Sh","Sch","","I","","E","Ju","Ja",
            "a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q",
            "r","s","t","u","v","w","x","y","z",
            "A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q",
            "R","S","T","U","V","W","X","Y","Z"};
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < message.length(); i++) {
            // Linear search of the table; characters not listed in it
            // (digits, punctuation, ...) are silently dropped.
            for (int x = 0; x < abcCyr.length; x++) {
                if (message.charAt(i) == abcCyr[x]) {
                    builder.append(abcLat[x]);
                }
            }
        }
        return builder.toString();
    }
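
A possible variation on the table lookup above, as a minimal sketch rather than a drop-in replacement: the same Russian-alphabet mapping stored in a `HashMap`, so each character is a single lookup instead of a scan over the whole array. The class name `TableTransliterator` and the fields `CYR`, `LAT` and `TABLE` are made up for this illustration. One deliberate difference, which is an assumption about what is usually wanted: characters that are not in the table (digits, punctuation, Latin letters) are copied through unchanged rather than dropped.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; the names here are illustrative only.
final class TableTransliterator {

    // Lower-case Russian alphabet and its Latin replacements, in the same order.
    private static final String CYR = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя";
    private static final String[] LAT = {
        "a", "b", "v", "g", "d", "e", "e", "zh", "z", "i", "y", "k", "l", "m",
        "n", "o", "p", "r", "s", "t", "u", "f", "h", "ts", "ch", "sh", "sch",
        "", "i", "", "e", "ju", "ja"
    };

    private static final Map<Character, String> TABLE = new HashMap<>();

    static {
        for (int i = 0; i < CYR.length(); i++) {
            char lower = CYR.charAt(i);
            String lat = LAT[i];
            TABLE.put(lower, lat);
            // Upper-case entry: capitalise only the first Latin letter ("Zh", "Sch").
            String capitalised = lat.isEmpty()
                    ? lat
                    : Character.toUpperCase(lat.charAt(0)) + lat.substring(1);
            TABLE.put(Character.toUpperCase(lower), capitalised);
        }
    }

    private TableTransliterator() {
    }

    static String transliterate(String message) {
        StringBuilder out = new StringBuilder(message.length());
        for (int i = 0; i < message.length(); i++) {
            char c = message.charAt(i);
            String mapped = TABLE.get(c);
            // Characters without a table entry are kept as they are.
            out.append(mapped != null ? mapped : String.valueOf(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(transliterate("Привет, мир"));
    }
}
```

Like the original method, this only handles one-to-one character replacements; context-dependent rules are where the ICU4J Transliterator is the better fit.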

Source: https://habr.com/ru/post/943862/

