I have what is probably a really dumb question about grep in R. Apologies, because it seems like it should be so simple; I'm obviously just missing something.
I have a character vector, let's call it alice. The following is part of alice:
T.8EFF.SP.OT1.D5.VSVOVA#4 T.8EFF.SP.OT1.D6.LISOVA#1 T.8EFF.SP.OT1.D6.LISOVA#2 T.8EFF.SP.OT1.D6.LISOVA#3 T.8EFF.SP.OT1.D6.VSVOVA#4 T.8EFF.SP.OT1.D8.VSVOVA#3 T.8EFF.SP.OT1.D8.VSVOVA#4 T.8MEM.SP#1 T.8MEM.SP#3 T.8MEM.SP.OT1.D106.VSVOVA#2 T.8MEM.SP.OT1.D45.LISOVA#1 T.8MEM.SP.OT1.D45.LISOVA#3
I want grep to give me the number after the D that appears in some of these strings, provided that the string contains "LIS", and an empty string or something like that otherwise.
I was hoping grep would return me the value of the capture group, not the whole string. Here is my R-flavored regex:
pattern <- "(?<=\\.D)([0-9]+)(?=.LIS)"
Nothing too complicated. But to get what I'm after, instead of just using grep(pattern, alice, value = TRUE, perl = TRUE), I do the following, which seems bad:
reg.out <- regexpr("(?<=\\.D)[0-9]+(?=.LIS)", alice, perl = TRUE)
substr(alice, reg.out, reg.out + attr(reg.out, "match.length") - 1)
Looking at it now, it doesn't seem too ugly, but the amount of messing about it took to get this completely trivial thing working has been embarrassing. Does anyone have pointers on how to do this properly?
Bonus points for pointing me to a web page that explains the difference between what I access with $, @ and attr.
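For anyone who wants to reproduce this, here is a self-contained sketch of the workaround on four of the strings from alice. It also shows one possible alternative using base R's regmatches(), which is not what I am using above; the names by_substr, by_regmatches and matched are made up for this example.

```r
# Four of the sample strings from alice (two contain "LIS", two do not).
alice <- c("T.8EFF.SP.OT1.D5.VSVOVA#4",
           "T.8EFF.SP.OT1.D6.LISOVA#1",
           "T.8MEM.SP#1",
           "T.8MEM.SP.OT1.D45.LISOVA#3")

# Digits preceded by ".D" and followed by any character plus "LIS".
pattern <- "(?<=\\.D)[0-9]+(?=.LIS)"

# regexpr() returns the start position of the first match in each string
# (-1 where there is no match); the lengths live in the "match.length" attribute.
reg.out <- regexpr(pattern, alice, perl = TRUE)

# The workaround from the question: cut the match out with substr().
# For non-matches start is -1 and stop is -3, so substr() yields "".
by_substr <- substr(alice, reg.out, reg.out + attr(reg.out, "match.length") - 1)

# Possible alternative: regmatches() returns only the matched pieces,
# so they are written back into a vector of empty strings to keep alignment.
matched <- reg.out != -1
by_regmatches <- rep("", length(alice))
by_regmatches[matched] <- regmatches(alice, reg.out)

print(by_substr)      # "" "6" "" "45"
print(by_regmatches)  # "" "6" "" "45"
```

Both approaches give an empty string where the pattern does not match, which is the behaviour I described wanting.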
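To make the bonus question concrete, here is a minimal sketch of the three accessors side by side. It is only an illustration, not a substitute for the explanation being asked for; the objects lst, x and p and the S4 class Point are hypothetical.

```r
library(methods)  # needed for S4 classes when run via older Rscript versions

# `$` extracts a named element from a list (or a column from a data frame).
lst <- list(value = 42)
print(lst$value)

# attr() reads a piece of metadata attached to any R object.
# This is how regexpr() exposes "match.length" on its integer result;
# `$` would fail there because the result is an atomic vector, not a list.
x <- 1:3
attr(x, "note") <- "hello"
print(attr(x, "note"))

# `@` reads a slot of an S4 object.
setClass("Point", representation(x = "numeric"))
p <- new("Point", x = 1)
print(p@x)
```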
grep r
Mike Dewar, Jun 03 '10 at 19:58