Change regex program excluding spaces

Question


I have a pattern that finds strings containing exactly one occurrence of a character, for example P. This works when matching against a string without spaces,

e.g.

APAXA

The regex ^[^P]*P[^P]*$

matches this line perfectly. However, what if I have a line like

 XPA DREP EDS

What regular expression would identify all the words in a single line that match the condition (the words are always separated by some whitespace: tab, space, etc.)?

E.g., how would I match XPA and DREP?

I use while (m.find()) to loop over the matches and System.out.println(m.group()) to print each one,

so m.group() must contain the entire word.

+1

java regex

dr85 Jan 20 '11 at 13:57


6 answers

Why use an overly complex regex?

    String string = "XPA DREP EDS";
    // Split on runs of whitespace, then test each word on its own.
    String[] s = string.split("\\s+");
    for (String str : s) {
        if (str.contains("P")) {
            System.out.println(str);
        }
    }

+1

ghostdog74 Jan 20 '11 at 14:37

You can try using the \s pattern (whitespace matching). Take a look at the regex documentation for Java.

0

hellatan Jan 20 '11 at 14:03


 \b[^P\s]*P[^P\s]*\b

will match all words that contain exactly one P. Remember to double the backslash when constructing your regular expression from a Java string.

Explanation:

    \b        # Assert position at start/end of a word
    [^P\s]*   # Match any number of characters except P and whitespace
    P         # Match a P
    [^P\s]*   # Match any number of characters except P and whitespace
    \b        # Assert position at start/end of a word
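As a rough sketch of how this pattern could be used with the while (m.find()) loop from the question: the class name SinglePWords and the method findWords are made up for illustration, and the backslashes are doubled because the pattern is written as a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SinglePWords {
    // \b[^P\s]*P[^P\s]*\b as a Java string literal (every backslash doubled).
    private static final Pattern ONE_P = Pattern.compile("\\b[^P\\s]*P[^P\\s]*\\b");

    // Collects every match produced by repeated calls to find().
    static List<String> findWords(String line) {
        List<String> words = new ArrayList<>();
        Matcher m = ONE_P.matcher(line);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(findWords(" XPA DREP EDS")); // prints [XPA, DREP]
    }
}
```

Each m.group() here is a single word rather than the whole line, which is what the question asks for.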

Note that \b does not match all word boundaries when working with a Unicode string (thanks tchrist for reminding me). If that matters, you can replace \b with (don't look):

 (?:(?<=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])|(?<![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]]))

(taken from the answer to this question )

0

Tim pietzcker Jan 20 '11 at 14:08

The regex being ^[^P]*P[^P]*$

Such a regular expression finds only a string containing exactly one P, which may or may not be what you want. I guess you want .*P.* instead.

To find all words containing at least one P, you can use \\S+P\\S+ , where \S denotes a non-whitespace character. You can use \w instead. Note that + requires at least one character on each side of the P; use * if words such as DREP should match as well.

To find all words containing exactly one P, you can use [^\\sP]+P[^\\sP]+(?=\\s) , which is more complicated. Here \s stands for whitespace, [^abc] matches any character except a, b, or c, and (?=...) is a lookahead. Without the lookahead, you would match fragments of a longer word, for example APB in APBPC.
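The fragment problem can be illustrated with a short sketch. It uses * quantifiers so that words beginning or ending in P also match. The class name ExactlyOneP and the lookaround pair (?<!\S) and (?!\S) are assumptions rather than part of this answer; they are one possible way to require whitespace or the edge of the line on both sides of a match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class ExactlyOneP {
    // No lookaround: can match fragments inside a longer word.
    static final Pattern PLAIN = Pattern.compile("[^\\sP]*P[^\\sP]*");

    // Assumed variant: no non-whitespace character may directly
    // precede or follow the match.
    static final Pattern WHOLE_WORD =
            Pattern.compile("(?<!\\S)[^\\sP]*P[^\\sP]*(?!\\S)");

    // Collects every match of the given pattern in the line.
    static List<String> findAll(Pattern pattern, String line) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(line);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll(PLAIN, "APBPC"));             // prints [APB, PC]
        System.out.println(findAll(WHOLE_WORD, "APBPC"));        // prints []
        System.out.println(findAll(WHOLE_WORD, "XPA DREP EDS")); // prints [XPA, DREP]
    }
}
```

A lookahead on its own only constrains the end of a match, so the sketch adds a lookbehind to constrain the start as well.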

0

maaartinus Jan 20 '11 at 14:13

Try adding the whitespace class ( \s ) to your negated character classes, and you'll also want to remove the ^ and $ anchors:

 [^P\s]*P[^P\s]*

or as a Java string literal:

 "[^P\\s]*P[^P\\s]*"

Please note that the above works only for ASCII, not for Unicode (as pointed out in tchrist's comments).

0

Bart kiers Jan 20 '11 at 14:24


jzd · Accepted Answer · 2011-01-20T13:59:30+0000

Split the line on whitespace, and then check each token against the existing regular expression.
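A minimal sketch of that approach, assuming the regex from the question is reused unchanged; the class name TokenFilter and the method matchingTokens are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class TokenFilter {
    // The regex from the question: exactly one P in the whole input.
    private static final Pattern EXACTLY_ONE_P = Pattern.compile("^[^P]*P[^P]*$");

    // Splits the line on runs of whitespace and keeps the matching tokens.
    static List<String> matchingTokens(String line) {
        List<String> result = new ArrayList<>();
        // trim() avoids an empty first token when the line starts with whitespace.
        for (String token : line.trim().split("\\s+")) {
            if (EXACTLY_ONE_P.matcher(token).matches()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(matchingTokens(" XPA DREP EDS")); // prints [XPA, DREP]
    }
}
```

Because each token is tested separately, the original anchored pattern needs no changes.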
