Question R-regexp

I need to reformat my data frame using a regexp, in particular entries like this one:

X21_GS04.A.mzdata 

it should be:

 GS04.A 

I tried

 pluto <- sub('^X[0-90_]+','', my.data.frame$File.Name, perl=TRUE) 

and it works; then I tried

 pluto <- sub('.mzdata$','', my.data.frame$File.Name, perl=TRUE) 

and it works too.

The problem is that I have no idea how to combine the two into one. I tried this:

 pluto <- sub('^X[0-90_]+ | .mzdata$','', my.data.frame$File.Name, perl=TRUE) 

but nothing appears. Can someone tell me where I am going wrong?

Best, Riccardo

+6
2 answers

Remove the spaces from the regular expression; they are matched literally. Also escape the . character as \. so it matches a literal dot, i.e.:

 ^X[0-9]+_|\.mzdata$ 
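One thing to watch when using this alternation in R: sub() replaces only the first match, so it would strip the prefix and leave the suffix in place. A minimal sketch using gsub() instead, assuming a hypothetical vector file.names that stands in for my.data.frame$File.Name:

```r
# Hypothetical sample values standing in for my.data.frame$File.Name
file.names <- c("X21_GS04.A.mzdata", "X7_GS11.B.mzdata")

# sub() stops after the first match, so only the leading "X<digits>_" is removed
first.only <- sub('^X[0-9]+_|\\.mzdata$', '', file.names, perl = TRUE)

# gsub() replaces every match, so both the prefix and the suffix are removed
pluto <- gsub('^X[0-9]+_|\\.mzdata$', '', file.names, perl = TRUE)

print(first.only)
print(pluto)
```

Note that the dot is written as \\. inside the R string, because the backslash itself has to be escaped in an R string literal.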
+2

The regular expression you're after is the following:

 ^X\d+_(.*)\.mzdata$ 

This matches the entire string and captures the part you want to keep in a group. You can then replace the whole match with \1 (a backreference to the capture group).

In R, it will be:

 result <- sub('^X\\d+_(.*)\\.mzdata$', '\\1', my.data.frame$File.Name, perl=TRUE) 
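As a quick check of how this behaves, here is a small sketch on made-up data. The data frame contents and the Sample column name are assumptions for illustration; only File.Name comes from the question. Because the pattern has to match the whole string, names that lack either the prefix or the suffix are returned unchanged.

```r
# Hypothetical data frame; only the File.Name column name comes from the question
my.data.frame <- data.frame(
  File.Name = c("X21_GS04.A.mzdata", "X105_QC.mzdata", "notes.txt"),
  stringsAsFactors = FALSE
)

# Keep only the captured middle part; non-matching names pass through as-is
my.data.frame$Sample <- sub('^X\\d+_(.*)\\.mzdata$', '\\1',
                            my.data.frame$File.Name, perl = TRUE)

print(my.data.frame$Sample)
```

This differs from the alternation approach, which would still strip a prefix or a suffix on its own when only one of them is present.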
+9

Source: https://habr.com/ru/post/893344/

