Which pattern should be used in java.util.Scanner to get the next identifier string?

In a line of text I have " *(,identifier1*(identifier2 ", and I want to read the identifiers, defined as runs of word characters ( [a-zA-Z_0-9] ).

Which pattern should I use? I was thinking about using:

 scanner.next( "[\\w]+"); 

but I get a java.util.InputMismatchException.

2 answers

The scanner's default delimiter is whitespace, so the first (and only) token in your scanner object is the whole string "*(,identifier1*(identifier2" . This is the token that next("[\\w]+") tries to match, and because the token as a whole does not match the pattern, the call throws an exception.
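To see this concretely, here is a minimal sketch (the class and method names are illustrative, not from the question) that returns the token the scanner actually sees and uses hasNext(pattern) to test it before calling next(pattern), which avoids the exception:

```java
import java.util.Scanner;

// Illustrative class name; not part of the original question.
class TokenCheck {

    // Returns the first whitespace-delimited token of the input.
    static String firstToken(String input) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.next();
        }
    }

    // True only if the whole next token matches the given regex.
    static boolean nextTokenMatches(String input, String regex) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.hasNext(regex);
        }
    }

    public static void main(String[] args) {
        String line = " *(,identifier1*(identifier2 ";
        // The whole string (minus surrounding spaces) is one token.
        System.out.println(firstToken(line));
        // The token contains non-word characters, so it does not match.
        System.out.println(nextTokenMatches(line, "\\w+"));
        // A token made only of word characters does match.
        System.out.println(nextTokenMatches("identifier1 rest", "\\w+"));
    }
}
```

Checking with hasNext(pattern) first is only a way to inspect what the scanner sees; it does not by itself extract the identifiers from inside the token.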

You could use findInLine("\\w+") instead:

    Scanner scan = new Scanner("*(,identifier1*(identifier2");
    System.out.println(scan.findInLine("\\w+"));
    System.out.println(scan.findInLine("\\w+"));

which produces:

    identifier1
    identifier2

Or, if you want to split the input string on runs of one or more characters that are not ASCII alphanumerics or _ , try:

    Scanner scan = new Scanner("*(,identifier1*(identifier2").useDelimiter("\\W+");
    while (scan.hasNext()) {
        System.out.println(scan.next());
    }

which produces the same output as before.

Note that I used a capital W , which is the negation of \w :

 \W == [^\w] == [^a-zA-Z0-9_] 
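
As a quick sanity check of that equivalence, the sketch below (the class and method names are illustrative) compares \W with the explicit class [^a-zA-Z0-9_] on every ASCII character and counts the cases where they disagree. It assumes the default regex flags, that is, without Pattern.UNICODE_CHARACTER_CLASS:

```java
import java.util.regex.Pattern;

// Illustrative class name; not part of the original answer.
class NonWordCheck {

    // Counts ASCII characters on which \W and [^a-zA-Z0-9_] disagree.
    static int countDisagreements() {
        Pattern shorthand = Pattern.compile("\\W");
        Pattern explicit = Pattern.compile("[^a-zA-Z0-9_]");
        int disagreements = 0;
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            boolean a = shorthand.matcher(s).matches();
            boolean b = explicit.matcher(s).matches();
            if (a != b) {
                disagreements++;
            }
        }
        return disagreements;
    }

    public static void main(String[] args) {
        // Prints the number of ASCII characters where the two classes differ.
        System.out.println(countDisagreements());
    }
}
```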

If there is no particular reason to use Scanner, you can take the whole string at once and extract the words from it directly. Of course, this loads all the words into memory at once, whereas with the scanner they are read one at a time:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Test {
        public static void main(String[] args) {
            List<String> words = extractWords("*(,identifier1*(identifier2");
            for (String word : words) {
                System.out.println(word);
            }
        }

        public static List<String> extractWords(String input) {
            List<String> out = new ArrayList<String>();
            Pattern re = Pattern.compile("\\w+");
            Matcher matcher = re.matcher(input);
            while (matcher.find()) {
                out.add(matcher.group());
            }
            return out;
        }
    }

This produces the output:

    identifier1
    identifier2
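
If reading the words one at a time matters, one possible middle ground, assuming Java 9 or later, is Scanner.findAll, which returns a lazily evaluated stream of matches. The following is only a sketch: the class name, the method name, and the limit parameter are illustrative and not part of the answer above.

```java
import java.util.List;
import java.util.Scanner;
import java.util.regex.MatchResult;
import java.util.stream.Collectors;

// Illustrative class name; requires Java 9+ for Scanner.findAll.
class LazyWords {

    // Collects at most `limit` word matches; later matches are never searched for.
    static List<String> firstWords(String input, int limit) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.findAll("\\w+")
                    .map(MatchResult::group)
                    .limit(limit)
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        String line = "*(,identifier1*(identifier2";
        // Only the first identifier is requested here.
        System.out.println(firstWords(line, 1));
        // A larger limit returns every identifier in the line.
        System.out.println(firstWords(line, 10));
    }
}
```

The stream is consumed inside the try-with-resources block because closing the scanner also ends the stream.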

Source: https://habr.com/ru/post/1338583/

