Conditional split in Perl

I have the following sentence:

my $sent = 'D. discoideum and D. purpureum developmental programs revealed'; 

Is there a way to split the sentence so that two consecutive words with a . (dot) between them are treated as one word?

So I hope to get this after splitting:

 $VAR = ['D. discoideum', 'and', 'D. purpureum', 'developmental', 'programs', 'revealed']; 

The standard split /\s+/ splits on every run of whitespace, so it separates 'D.' from 'discoideum' as well.

2 answers

Try splitting on:

 /(?<!\.)\s+/ 

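This expression matches any run of whitespace that is not immediately preceded by a period. The negative lookbehind (?<!\.) only checks for the period and does not consume it, so the period stays in the resulting word.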
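As a minimal sketch of how this pattern could be used with split (the variable names are just taken from the question, and the sentence is the one given there):

```perl
use strict;
use warnings;

my $sent = 'D. discoideum and D. purpureum developmental programs revealed';

# Split on whitespace, except where the character just before it is a period.
my @words = split /(?<!\.)\s+/, $sent;

# Print one element per line to inspect the result.
print "$_\n" for @words;
```

Note that the lookbehind only inspects the single character before each candidate position, so with two or more spaces after the period (for example 'D.  discoideum') the split would still happen at the second space. This sketch assumes single spaces between words, as in the question.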


Without split, using a global regex match in list context:

 my @words = $sent =~ /(\S+\.\s+\S+|\S+)/g; 
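For illustration, here is that match wrapped in a small self-contained script. The helper name split_keep_dotted is hypothetical and only used here to make the example reusable. The first alternative captures a word ending in a period together with the following word; the second alternative is the fallback for ordinary words.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the words of a sentence, keeping
# "X. word" pairs together as a single element.
sub split_keep_dotted {
    my ($text) = @_;
    # In list context with /g, the match returns every captured group.
    return $text =~ /(\S+\.\s+\S+|\S+)/g;
}

my $sent  = 'D. discoideum and D. purpureum developmental programs revealed';
my @words = split_keep_dotted($sent);

print join('|', @words), "\n";
# prints: D. discoideum|and|D. purpureum|developmental|programs|revealed
```

Unlike the lookbehind split, this version tolerates several spaces after the period, because \s+ sits inside the first alternative. Both approaches would also join a sentence-final word such as 'revealed.' with the word that follows it, so they suit input like the example rather than multi-sentence text.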

Source: https://habr.com/ru/post/1381060/

