Rename ä, ö, ü to ae, oe, ue

We want to rename strings so that "special" characters, such as German umlauts, are translated into their official umlaut-free representation (ä becomes ae, and so on). Does Java have a function that handles this mapping, not only for German umlauts but also for French, Czech, or Scandinavian characters? The goal is a function that renames files and directories so that they can be processed without problems on different platforms using Subversion.

This question is similar, but without a useful answer.

+1
2 answers

You can use the Unicode \p{InCombiningDiacriticalMarks} block property to remove (most) diacritics from strings:

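    import java.text.Normalizer;
    import java.util.regex.Pattern;

    public String normalize(String input) {
        // Decompose accented characters into base letter + combining mark (NFD)
        String output = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Remove the combining marks, leaving only the base letters
        Pattern pattern = Pattern.compile("\\p{InCombiningDiacriticalMarks}+");
        return pattern.matcher(output).replaceAll("");
    }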

This will not expand the German umlauts the way you want: it turns ö into o, ä into a, and so on. But perhaps that is good enough for you.
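If the ae/oe/ue spelling is required, one possible approach is to apply an explicit German replacement table first and only then strip the remaining diacritics. The following is a minimal sketch under that assumption, not a complete transliteration scheme. The class name `UmlautTransliterator`, the method `toAscii`, and the contents of the replacement table are hypothetical choices made for illustration. Letters that have no canonical decomposition, such as ø, pass through unchanged and would need their own table entries.

```java
import java.text.Normalizer;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class UmlautTransliterator {

    // Hypothetical replacement table: German letters that should be expanded
    // instead of simply losing their diacritic. Written with Unicode escapes
    // so the source file stays pure ASCII.
    private static final Map<String, String> GERMAN = new LinkedHashMap<>();
    static {
        GERMAN.put("\u00e4", "ae"); // a-umlaut
        GERMAN.put("\u00f6", "oe"); // o-umlaut
        GERMAN.put("\u00fc", "ue"); // u-umlaut
        GERMAN.put("\u00c4", "Ae"); // A-umlaut
        GERMAN.put("\u00d6", "Oe"); // O-umlaut
        GERMAN.put("\u00dc", "Ue"); // U-umlaut
        GERMAN.put("\u00df", "ss"); // sharp s
    }

    private static final Pattern MARKS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    public static String toAscii(String input) {
        // Compose first, so that "a" followed by U+0308 is seen as one character.
        String text = Normalizer.normalize(input, Normalizer.Form.NFC);
        // Expand the German letters before any diacritics are stripped.
        for (Map.Entry<String, String> e : GERMAN.entrySet()) {
            text = text.replace(e.getKey(), e.getValue());
        }
        // Decompose what is left and drop the combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return MARKS.matcher(decomposed).replaceAll("");
    }

    public static void main(String[] args) {
        // "Uebergroessentraeger"
        System.out.println(toAscii("\u00dcbergr\u00f6\u00dfentr\u00e4ger"));
        // "Cafe Prilis"
        System.out.println(toAscii("Caf\u00e9 P\u0159\u00edli\u0161"));
    }
}
```

The order matters: once the string has been decomposed and stripped, ö and o can no longer be told apart, so the German table has to run first.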

+1

Use the ICU Transliterator. It is a general-purpose class for performing this kind of transliteration. You may need to supply your own mapping.

+3

Source: https://habr.com/ru/post/959455/

