Regular expression for matching capital letters

def normalized?
  matches = match(/[^A-Z]*/)
  return matches.size == 0
end

This is my method on String, which checks that the string contains only uppercase letters. It works fine for non-matches, but when I call it on a string like "ABC", it reports no match, because apparently matches.size is 1 and not zero. There seems to be an empty element in it.

Can someone explain why?

+4
8 answers

MatchData#size returns the number of capture groups in the regular expression plus one, so md[i] refers to a valid group iff i < md.size. Thus, the value returned by size depends only on the regular expression, not on the matched string, and will never be 0.

You want matches.to_s.size or matches[0].size.
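
A small sketch of the difference, using the pattern from the question. The variable names md and grouped, and the second pattern, are only illustrative:

```ruby
# MatchData#size counts group 0 (the whole match) plus each capture group,
# so it reflects the shape of the pattern, not how much text was matched.
md = "ABC".match(/[^A-Z]*/)
puts md.size        # 1: no capture groups, only the whole match
puts md[0].inspect  # "": the whole match is the empty string
puts md[0].size     # 0: length of the matched text
puts md.to_s.size   # 0: the same length via to_s

grouped = "ABC".match(/(A)(B)/)
puts grouped.size   # 3: whole match plus two capture groups
```

So the check in the question would need to look at the length of the matched text rather than at the size of the MatchData.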

+2

Your regex is wrong - if you want it to match only uppercase strings, use /^[A-Z]+$/.
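
One caveat worth sketching: in Ruby, ^ and $ anchor at line boundaries, so a multi-line string passes if any single line is all uppercase. If the whole string must be uppercase, \A and \z are the stricter anchors. The helper names below are hypothetical, and match? assumes Ruby 2.4 or later:

```ruby
# Hypothetical helper names, for illustration only.
def upper_per_line?(s)
  s.match?(/^[A-Z]+$/)    # ^ and $ anchor at line boundaries in Ruby
end

def upper_whole_string?(s)
  s.match?(/\A[A-Z]+\z/)  # \A and \z anchor at the string boundaries
end

puts upper_per_line?("ABC")           # true
puts upper_per_line?("ABC\nabc")      # true: the first line alone satisfies it
puts upper_whole_string?("ABC\nabc")  # false
puts upper_whole_string?("ABC")       # true
```

For single-line input the two behave the same, so the choice only matters when the string may contain newlines.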

+3

Your regular expression is incorrect. /[^A-Z]*/ means "match zero or more characters that are not between A and Z, anywhere in the string." The string ABC has zero characters that are not between A and Z, so it matches the regular expression.

Change your regular expression to /^[^A-Z]+$/. This means "match one or more characters that are not between A and Z, and make sure that every character between the beginning and the end of the line is not between A and Z". Then the string ABC will not match, and you can check matches[0].size or whatever, as in sepp2k's answer.
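
A short sketch of how this pattern behaves, with two things to keep in mind: match returns nil when nothing matches, so a nil check is needed before indexing, and a mixed-case string such as "AbC" fails to match as well. The constant and helper names are hypothetical:

```ruby
NO_UPPERCASE = /^[^A-Z]+$/

p "abc".match(NO_UPPERCASE)  # a MatchData: no uppercase letters at all
p "ABC".match(NO_UPPERCASE)  # nil
p "AbC".match(NO_UPPERCASE)  # nil as well: one uppercase letter is enough

# Nil-safe length of the matched text (hypothetical helper).
def matched_length(s, re)
  md = s.match(re)
  md ? md[0].size : 0
end

puts matched_length("abc", NO_UPPERCASE)
puts matched_length("ABC", NO_UPPERCASE)
```

Because "ABC" and "AbC" both give nil here, this pattern on its own cannot tell an all-uppercase string from a mixed-case one.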

+3

ruby-1.9.2-p180> def normalized? s
ruby-1.9.2-p180?>   s.match(/^[[:upper:]]+$/) ? true : false
ruby-1.9.2-p180?> end
=> nil
ruby-1.9.2-p180> normalized? "asdf"
=> false
ruby-1.9.2-p180> normalized? "ASDF"
=> true

+2

The * in your regular expression means that it matches any number of non-uppercase characters, including zero. Therefore, it always matches everything. The fix is to remove the *; then it will not match a string containing only uppercase characters. (Although you will need another test if zero-length strings are not allowed.)

0

There is only one regular expression that matches a string consisting only of capital letters:

def onlyupper(s)
  (s =~ /^[A-Z]+$/) != nil
end

Truth table:

/[^A-Z]*/:
Testing 'asdf' matched 'asdf' length 4
Testing 'HHH' matched '' length 0
Testing '' matched '' length 0
Testing '-=AAA' matched '-=' length 2
--------
/[^A-Z]+/:
Testing 'asdf' matched 'asdf' length 4
Testing 'HHH' matched nil
Testing '' matched nil
Testing '-=AAA' matched '-=' length 2
--------
/^[^A-Z]*$/:
Testing 'asdf' matched 'asdf' length 4
Testing 'HHH' matched nil
Testing '' matched '' length 0
Testing '-=AAA' matched nil
--------
/^[^A-Z]+$/:
Testing 'asdf' matched 'asdf' length 4
Testing 'HHH' matched nil
Testing '' matched nil
Testing '-=AAA' matched nil
--------
/^[A-Z]*$/:
Testing 'asdf' matched nil
Testing 'HHH' matched 'HHH' length 3
Testing '' matched '' length 0
Testing '-=AAA' matched nil
--------
/^[A-Z]+$/:
Testing 'asdf' matched nil
Testing 'HHH' matched 'HHH' length 3
Testing '' matched nil
Testing '-=AAA' matched nil
--------
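
A table like this can be regenerated with a few lines of Ruby. The answer does not show the script it used, so the following is only a sketch of one way to do it; the names PATTERNS, INPUTS and describe are assumptions:

```ruby
# The six patterns and four inputs that appear in the table.
PATTERNS = [
  /[^A-Z]*/, /[^A-Z]+/, /^[^A-Z]*$/,
  /^[^A-Z]+$/, /^[A-Z]*$/, /^[A-Z]+$/
].freeze
INPUTS = ['asdf', 'HHH', '', '-=AAA'].freeze

# Describe the result of matching s against re (hypothetical helper).
def describe(re, s)
  md = s.match(re)
  md ? "matched '#{md[0]}' length #{md[0].size}" : "matched nil"
end

PATTERNS.each do |re|
  puts "#{re.inspect}:"
  INPUTS.each { |s| puts "Testing '#{s}' #{describe(re, s)}" }
  puts "--------"
end
```

Only the last pattern, /^[A-Z]+$/, accepts 'HHH' while rejecting the other three inputs, including the empty string.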
0

If you want to know whether the input string consists entirely of English capital letters, that is, A-Z, then you must remove the Kleene star, since it will match before and after every single character of any input string (a zero-length match). The expression !s[/[^A-Z]/] is true when the string contains no character other than A to Z:

irb(main):001:0> def normalized? s
irb(main):002:1>   return !s[/[^A-Z]/]
irb(main):003:1> end
=> nil
irb(main):004:0> normalized? "ABC"
=> true
irb(main):005:0> normalized? "AbC"
=> false
irb(main):006:0> normalized? ""
=> true
irb(main):007:0> normalized? "abc"
=> false

0

This question needs a clearer answer. As tchrist said in a comment (and I wish he had posted it as an answer), the "regex for matching capitals" is:

 /\p{Uppercase}/ 

As tchrist mentions, this "differs from the general category \p{Uppercase_Letter}, aka \p{Lu}. This is because there are non-letters that are considered uppercase."

0

Source: https://habr.com/ru/post/1345294/