Sorry for the simple question about regular expressions, but I can't get what I need without what seems to me an overly complicated solution. I am parsing a file containing sequences of the three letters A, E, and D, such as
AADDEEDDA
EEEEEEEE
AEEEDEEA
AEEEDDAAA
and I would like to identify only those that start with E and end in D with just one change in the sequence, for example:
EDDDDDDDD
EEEDDDDDD
EEEEEEEED
I can't find the right regex to do this. Here is my latest attempt:
echo "1,AAEDDEED,1\n2,EEEEDDDD,2\n3,EDEDEDED" | gawk -F, '{if($2 ~ /^E[(ED){1,1}]*D$/ && $2 !~ /^E[(ED){2,}]*D$/) print $0}'
which does not work. Any help?
Thanks in advance.
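
For reference, a minimal sketch of one way to read the requirement, assuming "just one change" means a run of one or more E followed by a run of one or more D and nothing else. In the attempt above, the square brackets make `[(ED){1,1}]` a bracket expression that matches any single one of the characters `(`, `E`, `D`, `)`, `{`, `1`, `,`, `}`, so it does not group `ED` as intended. The function name `filter_single_change` is hypothetical, and plain `awk` is used on the assumption that nothing gawk-specific is needed.

```shell
# Hypothetical helper: print records whose second comma-separated field
# is one or more E followed by one or more D (a single E-to-D change).
filter_single_change() {
    awk -F, '$2 ~ /^E+D+$/'
}

# printf is used instead of echo so that \n is interpreted consistently.
printf '1,AAEDDEED,1\n2,EEEEDDDD,2\n3,EDEDEDED\n' | filter_single_change
# prints: 2,EEEEDDDD,2
```

If the sequences are alone on each line rather than in the second field, the same pattern should work without `-F,` by matching against `$0`.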