R regex beginning of line ^ in data frame values

Given:

test <- data.frame(Speed=c("2 Mbps", "10 Mbps")) 

Why does this regular expression match both values:

  grepl("[0-9]*Mbps$", test[,"Speed"], ignore.case=TRUE) 

while this one matches neither:

  grepl("^[0-9]*Mbps$", test[,"Speed"], ignore.case=TRUE) 

The ^ (start of line / string) anchor causes the problem, but why?

+6
3 answers

The regular expression ^[0-9]*Mbps$ looks for digits at the beginning of the string, immediately followed by Mbps at the end. Since your values have a space between the number and the unit, there is no match. To match these strings, use ^[0-9]*\\s*Mbps$.

 test <- data.frame(Speed=c("2 Mbps", "10 Mbps"))
 grepl("^[0-9]*\\s*Mbps$", test[,"Speed"], ignore.case=TRUE)

Output:

 [1] TRUE TRUE 

Without the ^ anchor, [0-9]*Mbps$ matches only the Mbps at the end of each element: the match may start anywhere in the string, and [0-9]* can match an empty string because of the * quantifier.
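To see which part of each string the unanchored pattern actually consumes, one option is to extract the match with regexpr() and regmatches() from base R. This is only an illustrative sketch; the vector name speed and the extra "123Mbps" element are assumptions added for the demonstration.

```r
# Sample values; "123Mbps" is added here to show a case where digits do take part in the match.
speed <- c("2 Mbps", "10 Mbps", "123Mbps")

# regexpr() locates the first match in each element,
# regmatches() pulls out the matched substring.
m <- regexpr("[0-9]*Mbps$", speed, ignore.case = TRUE)
matched <- regmatches(speed, m)

print(matched)
# [1] "Mbps"    "Mbps"    "123Mbps"
```

For the first two values the digits are not part of the match at all, which is why adding ^ changes the result.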

+4

Because there is no space in the regular expression. Either "^[0-9]* Mbps$" or "^[0-9]*\\s*Mbps$" will match the inputs.


"[0-9]*Mbps$" matches, not necessarily from the beginning of the string, "zero or more digit characters, followed by Mbps and then the end of the string."

"^[0-9]*Mbps$" does not match the inputs, because the string must start with zero or more digits, followed immediately by "Mbps" (no space!) and then the end of the string.
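As a rough side-by-side comparison, the three anchored variants can be run against the same input. The vector name speed and the extra "5Mbps" value are assumptions, added to show where the literal-space version and the \\s* version differ.

```r
# Two values with a space (as in the question) plus one without (hypothetical).
speed <- c("2 Mbps", "10 Mbps", "5Mbps")

# No space allowed between digits and unit.
no_space  <- grepl("^[0-9]*Mbps$",     speed, ignore.case = TRUE)
# Exactly one literal space required.
one_space <- grepl("^[0-9]* Mbps$",    speed, ignore.case = TRUE)
# Any amount of whitespace, including none.
any_space <- grepl("^[0-9]*\\s*Mbps$", speed, ignore.case = TRUE)

print(no_space)   # FALSE FALSE  TRUE
print(one_space)  #  TRUE  TRUE FALSE
print(any_space)  #  TRUE  TRUE  TRUE
```

If the data may contain both "2 Mbps" and "2Mbps", the \\s* form is probably the safer choice.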

+3

The second version says that the only characters allowed before "MBPS", "mbps" or "Mbps" (if any) are digits, all the way back to the start of the string. Take a look at the results on an extended data frame with more cases:

 > test <- data.frame(Speed=c("2 Mbps", "10 Mbps", "123Mbps", " Mbps", "aMbps", "Mbps"))
 > grepl("^[0-9]*Mbps$", test[,"Speed"], ignore.case=TRUE)
 [1] FALSE FALSE  TRUE FALSE FALSE  TRUE
 > grepl("[0-9]*Mbps$", test[,"Speed"], ignore.case=TRUE)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE

The "trick" or "gotcha" here is that grepl("[0-9]*Mbps$", ...) is really no different from grepl("Mbps$", ...). It will match a whole range of strings that you probably don't want.
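The equivalence can be checked directly. If the goal is to accept only values that really contain a number, one possible tightening is to anchor the pattern and require at least one digit with +. This is a sketch, not the only reasonable pattern; the names speed, loose, suffix and strict are made up for the example.

```r
speed <- c("2 Mbps", "10 Mbps", "123Mbps", " Mbps", "aMbps", "Mbps")

# The unanchored pattern and a plain suffix test give the same answer.
loose  <- grepl("[0-9]*Mbps$", speed, ignore.case = TRUE)
suffix <- grepl("Mbps$",       speed, ignore.case = TRUE)
print(identical(loose, suffix))
# [1] TRUE

# Anchored, at least one digit, optional whitespace before the unit.
strict <- grepl("^[0-9]+\\s*Mbps$", speed, ignore.case = TRUE)
print(strict)
# [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
```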

+2

Source: https://habr.com/ru/post/987962/

