Checking if a character is part of the Latin alphabet?

I need to check whether a character is a letter or a space before moving on with processing, so I wrote:

 for (char c : take.toCharArray()) { if (!(Character.isLetter(c) || Character.isSpaceChar(c))) continue; data.append(c); }

When I looked through the data, I saw that it contains characters that look like Unicode representations of letters from outside the Latin alphabet. How can I change the above code to tighten my conditions, so that it only accepts letters in the ranges [a-z] and [A-Z]?

Is a regex the way to go, or is there a better (faster) way?

+4
3 answers

If you just want to strip out everything that is not an ASCII letter, a quick approach is to use String.replaceAll() with a regex:

 s.replaceAll("[^a-zA-Z]", "") 

I can't say anything about its performance compared to scanning character by character and appending to a StringBuilder, though.
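Note that the pattern above also removes spaces, which the question wants to keep. A minimal sketch of both variants follows; the class and method names (RegexFilter, lettersOnly, lettersAndSpaces) are made up for illustration, and the second pattern assumes that only the plain space character should survive:

```java
class RegexFilter {
    // Removes every character that is not an ASCII letter (spaces included).
    static String lettersOnly(String s) {
        return s.replaceAll("[^a-zA-Z]", "");
    }

    // Assumed variant: also keeps the plain space character U+0020.
    static String lettersAndSpaces(String s) {
        return s.replaceAll("[^a-zA-Z ]", "");
    }

    public static void main(String[] args) {
        // \u00e9 is 'é' and \u00f6 is 'ö'; escapes avoid source-encoding issues.
        String sample = "H\u00e9llo w\u00f6rld!";
        System.out.println(lettersOnly(sample));      // Hllowrld
        System.out.println(lettersAndSpaces(sample)); // Hllo wrld
    }
}
```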

+2

If you specifically want to process only these 52 characters, then just process them:

 public static boolean isLatinLetter(char c) { return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'); } 
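Plugged into the loop from the question, this could look roughly like the sketch below. The wrapper class LatinFilter and the filter method are hypothetical names, and treating only ' ' as an acceptable space is an assumption (Character.isSpaceChar also accepts other Unicode space separators):

```java
class LatinFilter {
    static boolean isLatinLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    // Hypothetical helper: keeps ASCII letters and plain spaces, drops everything else.
    static String filter(String take) {
        StringBuilder data = new StringBuilder(take.length());
        for (int i = 0; i < take.length(); i++) {
            char c = take.charAt(i);
            if (isLatinLetter(c) || c == ' ') {
                data.append(c);
            }
        }
        return data.toString();
    }

    public static void main(String[] args) {
        // \u00ef is 'ï', which Character.isLetter would accept but this check rejects.
        System.out.println(filter("na\u00efve text")); // nave text
    }
}
```

Two range comparisons per character avoid any regex machinery, which is probably why this is a common choice when speed matters; measure on your own data before relying on that.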
+14

I would use the regular expression that you specified for this. It is easy to read and should be pretty fast (especially if you compile it once and keep it in a static field).
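String.replaceAll() compiles its pattern on every call, so keeping a compiled java.util.regex.Pattern in a static final field avoids that repeated work. A rough sketch, where the class name PrecompiledFilter, the constant name, and the choice to keep plain spaces are all assumptions made for illustration:

```java
import java.util.regex.Pattern;

class PrecompiledFilter {
    // Compiled once at class load; matches anything that is not an ASCII letter or a plain space.
    private static final Pattern NOT_LATIN_OR_SPACE = Pattern.compile("[^a-zA-Z ]");

    static String filter(String s) {
        return NOT_LATIN_OR_SPACE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // \u00c5 is 'Å' and \u00f6 is 'ö'.
        System.out.println(filter("\u00c5ngstr\u00f6m unit")); // ngstrm unit
    }
}
```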

+1

Source: https://habr.com/ru/post/1394937/

