StringTokenizer splitting on "<br/>"

Question

I may be stupid, but I do not understand why StringTokenizer behaves the way it does here:

 import static org.apache.commons.lang.StringEscapeUtils.escapeHtml;

 String object = (String) value;
 String escaped = escapeHtml(object);
 StringTokenizer tokenizer = new StringTokenizer(escaped, escapeHtml("<br/>"));

If, for example, value is

 Hej<br/>$user.get(0).name Har vundet<br/><table border='1'><tr><th>Name</th><th>Played</th><th>Brewed</th></tr>#foreach( $u in $user )<tr><td>$u.name</td> <td>$u.played</td> <td>$u.brewed</td></tr>#end</table><br/>

then the result is

 Hej $use . e (0).name Ha vunde a eo de ='1' h Name h h P ayed h h B ewed h #fo each( $u in $use ) d $u.name d d $up ayed d d $u. ewed d #end a e

It makes no sense to me.

How can I make it behave as I expect?

+4

java

Anamuser Dec 29 '10 at 17:49

5 answers

Each character in the delimiter string is treated as a separate delimiter. Thus, your code splits on each of the characters "&", "l", "t", ";", "b", "r", "/" and "g" (since escapeHtml replaces "<" and ">" with "&lt;" and "&gt;" respectively).
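To see the per-character behavior in isolation, here is a minimal sketch. The class name DelimiterSetDemo and the helper tokens are made up for illustration, and the escaped delimiter is written out literally as "&lt;br/&gt;" (what escapeHtml("<br/>") produces) so that the example does not depend on Commons Lang.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original question.
class DelimiterSetDemo {

    // Collects every token StringTokenizer yields for the given input and delimiter string.
    static List<String> tokens(String input, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumption: this literal equals escapeHtml("<br/>").
        String delims = "&lt;br/&gt;";
        // 't', 'b', 'l' and 'r' are all delimiters, so the words fall apart.
        System.out.println(tokens("table border", delims)); // prints [a, e , o, de]
    }
}
```

The input contains no "<br/>" at all, yet it is still cut into pieces, which is the same effect as in the question's output.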

You probably want to use String.split, which takes a regular expression as the separator:

 String[] parts = object.split("<br/>");

or

 String[] parts = escaped.split(escapeHtml("<br/>"));

Just make sure your separator token does not have special regular expression characters.
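If the separator might contain regular expression metacharacters, one option is to wrap it in Pattern.quote so that it is matched literally. A small sketch, where LiteralSplitDemo and splitLiteral are hypothetical names used only for this example:

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class LiteralSplitDemo {

    // Splits input on the separator taken literally, even if it contains regex metacharacters.
    static String[] splitLiteral(String input, String separator) {
        return input.split(Pattern.quote(separator));
    }

    public static void main(String[] args) {
        // "|" means alternation in a regex; quoted, it is just a pipe character.
        for (String part : splitLiteral("a|b|c", "|")) {
            System.out.println(part);
        }
        // "<br/>" has no metacharacters, so quoting it is harmless but not required.
        for (String part : splitLiteral("Hej<br/>Har vundet<br/>", "<br/>")) {
            System.out.println(part);
        }
    }
}
```

Note that String.split drops trailing empty strings, so the final "<br/>" in the second input does not produce an extra empty element.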

+2

Cameron skinner Dec 29 '10 at 17:56

If you want to split a string on a whole word rather than on individual characters, you are better off using String.split.

I did a test:

 public static void main(String[] args) {
     String s = "Hej<br/>$user.get(0).name Har vundet<br/><table border='1'><tr><th>Name</th><th>Played</th><th>Brewed</th></tr>#foreach( $u in $user )<tr><td>$u.name</td> <td>$u.played</td> <td>$u.brewed</td></tr>#end</table><br/>";
     String[] lines = s.split("<br/>");
     for (String ss : lines) {
         System.out.println(ss);
     }
 }

and here you have the result:

 Hej
 $user.get(0).name Har vundet
 <table border='1'><tr><th>Name</th><th>Played</th><th>Brewed</th></tr>#foreach( $u in $user )<tr><td>$u.name</td> <td>$u.played</td> <td>$u.brewed</td></tr>#end</table>

Tjena

+1

spuas Dec 29 '10 at 18:02

StringTokenizer splits on each individual character of the delimiter string.

You need to use split (be careful: its argument is a regular expression).

 String[] lines = "some html string<br/>with line breaks<br/>".split("<br/>");

0

Will Dec 29 '10 at 17:55

You cannot use a StringTokenizer with a multi-character delimiter. One possible solution to your problem is to replace "<br/>" with a single character that you can guarantee will not appear in your string, and then use a StringTokenizer with that character as the separator.
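A rough sketch of that workaround follows. The sentinel character U+0001 is only an assumption; pick whatever character you can really guarantee is absent from your data. The names SentinelTokenizerDemo and splitOnBr are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original answer.
class SentinelTokenizerDemo {

    // Assumption: this control character never occurs in the input text.
    static final String SENTINEL = String.valueOf((char) 1);

    // Replaces every literal "<br/>" with the sentinel, then tokenizes on that single character.
    static List<String> splitOnBr(String input) {
        String marked = input.replace("<br/>", SENTINEL);
        StringTokenizer st = new StringTokenizer(marked, SENTINEL);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(splitOnBr("Hej<br/>Har vundet<br/>")); // prints [Hej, Har vundet]
    }
}
```

Keep in mind that StringTokenizer never returns empty tokens, so two consecutive "<br/>" markers are treated like one.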

0

murgatroid99 Dec 29 '10 at 17:55

Dave jarvis · Accepted Answer · 2010-12-29T17:51:48+0000

From the documentation:

The characters in the delim argument are the delimiters for separating tokens. Delimiter characters themselves will not be treated as tokens.

In other words, StringTokenizer uses each of the following characters to split the string (they are the characters of the escaped delimiter "&lt;br/&gt;"):

&
l
t
;
b
r
/
g

When it encounters any of these characters in the string (the escaped variable in your code), the StringTokenizer instance ends the current token and discards the delimiter character. You can confirm this by noting that the letter r does not appear in the output.

Use String.split instead.
