Regex that finds consecutive capitalized words

I am looking for a regular expression that can identify runs of consecutive words in a sentence that each begin with a capital letter.

If you take the text below:

The A-Z Group is a long-established market leader in the provision of information for the global air cargo community, and also for the defence and security sectors through BDEC Limited, publishers of the British Defence Equipment Catalogue and British Defence Industry Directory.

I want to get the following:

A-Z Group

BDEC Limited

British Defence Equipment Catalogue

British Defence Industry Directory

Is this possible with regex? If so, can anyone suggest one?

+3


/([A-Z][\w-]*(\s+[A-Z][\w-]*)+)/

For example, in Ruby:

ruby-1.9.2-p0 > %Q{The A-Z Group is a long-established market leader in the provision of information for the global air cargo community, and also for the defence and security sectors through BDEC Limited, publishers of the British Defence Equipment Catalogue and British Defence Industry Directory.}.scan(/([A-Z][\w-]*(\s+[A-Z][\w-]*)+)/).map{|i| i.first}

=> ["The A-Z Group", "BDEC Limited", "British Defence Equipment Catalogue", "British Defence Industry Directory"]
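
The same scan can also be written with a non-capturing group, in which case `scan` returns the matched strings directly and the `map` step is not needed. A minimal sketch, assuming the sample sentence from the question; the variable names are illustrative only:

```ruby
# Sample sentence from the question.
text = "The A-Z Group is a long-established market leader in the provision of " \
       "information for the global air cargo community, and also for the defence " \
       "and security sectors through BDEC Limited, publishers of the British " \
       "Defence Equipment Catalogue and British Defence Industry Directory."

# (?:...) groups without capturing, so String#scan returns whole matches
# instead of an array of capture groups.
phrase_re = /[A-Z][\w-]*(?:\s+[A-Z][\w-]*)+/

phrases = text.scan(phrase_re)
puts phrases
```

Because the pattern starts at any capitalized word, the leading "The" is still part of the first match; dropping it would need a separate rule.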

+9

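Assuming each word starts with a capital letter, contains only letters, digits and hyphens, and is followed by at most one whitespace character: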

([A-Z][a-zA-Z0-9-]*[\s]{0,1}){2,}

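Edit: as pointed out, the pattern above also matches a single all-caps word such as THE, because the whitespace between words is optional. This version requires whitespace between the words: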

([A-Z][a-zA-Z0-9-]*)([\s][A-Z][a-zA-Z0-9-]*)+
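
To see the difference between the two patterns in this answer, here is a small sketch in Ruby. The sample string and the `whole_matches` helper are made up for illustration; the helper exists only because `scan` returns capture groups, not whole matches, when the pattern contains groups.

```ruby
# Collect whole matches even when the pattern contains capture groups.
def whole_matches(text, pattern)
  matches = []
  text.scan(pattern) { matches << Regexp.last_match[0] }
  matches
end

# First pattern: the whitespace after each word is optional.
loose = /([A-Z][a-zA-Z0-9-]*[\s]{0,1}){2,}/
# Second pattern: whitespace is required between capitalized words.
strict = /([A-Z][a-zA-Z0-9-]*)([\s][A-Z][a-zA-Z0-9-]*)+/

sample = "THE cat saw New York"

loose_hits = whole_matches(sample, loose)
strict_hits = whole_matches(sample, strict)

p loose_hits   # includes "THE " because "TH" and "E " count as two "words"
p strict_hits  # only "New York"
```

With the first pattern the engine can split one all-caps word into several pieces to satisfy `{2,}`; the second pattern closes that gap by demanding a whitespace character before every follow-on word.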
+4

$mystring = "the United States of America has many big cities like New York and Los Angeles, and others like Atlanta";

@phrases = $mystring =~ /[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*/g;

print "\n" . join(", ", @phrases) . "\n\n# phrases = " . scalar(@phrases) . "\n\n";

Output:

$ ./try_me.pl

United States, America, New York, Los Angeles, Atlanta

# phrases = 5
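
Note that this pattern ends in `*`, so the follow-on words are optional and single capitalized words such as America and Atlanta are returned as well. A sketch of the same pattern in Ruby (used here only to match the earlier answer), next to a variant ending in `+` that keeps only runs of two or more words; the variable names are illustrative:

```ruby
text = "the United States of America has many big cities like New York " \
       "and Los Angeles, and others like Atlanta"

# Trailing *: follow-on capitalized words are optional, so single words match too.
any_run = /[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*/
# Trailing +: at least one follow-on capitalized word is required.
multi_word = /[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)+/

all_hits = text.scan(any_run)
multi_hits = text.scan(multi_word)

puts all_hits.join(", ")
puts multi_hits.join(", ")
```

Which variant fits depends on whether a lone capitalized word should count as a phrase; the question as asked wants consecutive words, which suggests the `+` form.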
+1

Source: https://habr.com/ru/post/1773379/

