Parsing a string correctly without losing the delimiters

I have a piece of code that basically translates English into text speak.

I am currently using the String.split() method with \\W as the delimiter, which removes all non-word characters.

In its current form, this is what I get:

 input:  I hate text speak!:)
 output: I h8 txt spk

Is there any way to do this without losing the delimiters?

EDIT: Here is the method that does the parsing. As described, it replaces each delimiter with a space, but at least the result is still readable ...

 public static String engToText(String text) {
     text = text.toLowerCase();
     String translated = " ";
     // breaks string into tokens
     String[] tokens = text.split("\\W");
     for (int x = 0; x < tokens.length; x++) {
         if (wordMapEng.containsKey(tokens[x])) {
             translated += " " + wordMapEng.get(tokens[x]);
         } else {
             translated += " " + tokens[x];
         }
     }
     return translated.trim();
 }
1 answer

You can use the StringTokenizer class, which has the

 StringTokenizer(String str, String delim, boolean returnDelims) 

constructor. When returnDelims is true, iterating over the tokens returns the delimiters as well.
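As a rough sketch of how that could look for the method in the question: the dictionary entries and the DELIMS string below are assumptions made for illustration (the real contents of wordMapEng are not shown in the question, and StringTokenizer takes a literal set of delimiter characters rather than a regex such as \\W). Each delimiter comes back as its own one-character token, so it can be appended unchanged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

class TextSpeak {
    // Hypothetical dictionary: the real wordMapEng is not shown in the question.
    static final Map<String, String> wordMapEng = new HashMap<>();
    static {
        wordMapEng.put("hate", "h8");
        wordMapEng.put("text", "txt");
        wordMapEng.put("speak", "spk");
    }

    // Assumed delimiter set: StringTokenizer treats every character here as a delimiter.
    static final String DELIMS = " \t\n.,;:!?()'\"-";

    public static String engToText(String text) {
        StringBuilder translated = new StringBuilder();
        // returnDelims = true makes the tokenizer hand back delimiters as tokens too.
        StringTokenizer tokenizer = new StringTokenizer(text.toLowerCase(), DELIMS, true);
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            // Known words are translated; delimiters and unknown words pass through as-is.
            translated.append(wordMapEng.getOrDefault(token, token));
        }
        return translated.toString();
    }

    public static void main(String[] args) {
        System.out.println(engToText("I hate text speak!:)")); // prints "i h8 txt spk!:)"
    }
}
```

Because the original spacing and punctuation are copied through, there is no need to re-insert spaces or trim the result. Any character missing from DELIMS stays attached to its neighbouring word, so the set would need to be extended to match the input you expect.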


Source: https://habr.com/ru/post/1442065/

