Detecting Japanese characters in Java strings

I am trying to detect whether a Java string contains Japanese characters. Since it doesn't matter to me whether the characters form a grammatically correct sentence, I thought I would use a regular expression to match any Japanese character in the string, like this:

    package de.cg.javatest;

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class JavaTest {
        public static void main(String[] args) {
            String aString = "なにげない日々。";
            Pattern pat = Pattern.compile("[\\p{InHiragana}]");
            Matcher m = pat.matcher(aString);
            System.out.println(m.matches()); // false
        }
    }

However, the print statement always shows false. I tried changing the pattern to

    [\\p{IsHiragana}]
    [\\p{InHiragana}]+

and I also tried entering the character codes manually. Is there something I am missing, or do I need to take a different approach?

1 answer

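Matcher.matches returns true only when the pattern matches the entire string. The pattern [\\p{InHiragana}] matches exactly one Hiragana character, so it can never match a whole string that is longer than one character. Adding the + quantifier does not help here either because, as anonymous noted, not every character in the string is Hiragana (the closing 。, for example, lies outside the Hiragana block).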
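To make the whole-string behaviour of Matcher.matches concrete, here is a minimal sketch. The class name MatchesDemo, its helper methods, and the sample strings are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

class MatchesDemo {
    // Exactly one character from the Hiragana block.
    static final Pattern ONE_HIRAGANA = Pattern.compile("\\p{InHiragana}");
    // One or more characters, all from the Hiragana block.
    static final Pattern ALL_HIRAGANA = Pattern.compile("\\p{InHiragana}+");

    // matches() succeeds only if the pattern consumes the whole input.
    static boolean isSingleHiragana(String s) {
        return ONE_HIRAGANA.matcher(s).matches();
    }

    static boolean isAllHiragana(String s) {
        return ALL_HIRAGANA.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSingleHiragana("な"));             // true
        System.out.println(isSingleHiragana("なにげない"));       // false: more than one character
        System.out.println(isAllHiragana("なにげない"));          // true: every character is Hiragana
        System.out.println(isAllHiragana("なにげない日々。"));     // false: 日, 々 and 。 are not Hiragana
    }
}
```

The last call is the situation in the question: the input mixes Hiragana with other characters, so no whole-string match is possible.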

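Changing the pattern as follows lets you check whether the string contains any Hiragana character (note that . does not match line terminators by default, so multi-line input would also need Pattern.DOTALL):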

 Pattern pat = Pattern.compile(".*\\p{InHiragana}.*"); 

With Matcher.find you do not need to modify the pattern at all, because find looks for a match anywhere in the input:

    Pattern pat = Pattern.compile("\\p{InHiragana}"); // [..] is not needed.
    Matcher m = pat.matcher(aString);
    System.out.println(m.find()); // true
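Since the question asks about Japanese characters in general rather than Hiragana alone, the same find approach can be widened to cover Katakana and kanji. The following is only a sketch under assumptions: the class name JapaneseDetector and the method containsJapanese are hypothetical, it relies on the Unicode script properties available since Java 7, and \\p{IsHan} also matches Chinese text, so it cannot tell the two languages apart.

```java
import java.util.regex.Pattern;

class JapaneseDetector {
    // Any single character whose Unicode script is Hiragana, Katakana, or Han (kanji).
    // Assumption: Han is treated as "Japanese" here, although it is shared with Chinese.
    private static final Pattern JAPANESE =
            Pattern.compile("[\\p{IsHiragana}\\p{IsKatakana}\\p{IsHan}]");

    // find() scans for a match anywhere in the input, so no .* wrapping is needed.
    static boolean containsJapanese(String text) {
        return JAPANESE.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsJapanese("なにげない日々。")); // true
        System.out.println(containsJapanese("カタカナ"));        // true
        System.out.println(containsJapanese("Hello, world"));  // false
    }
}
```

Compiling the pattern once in a static field avoids recompiling it on every call, which matters if the check runs over many strings.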

Source: https://habr.com/ru/post/975928/

