Why doesn't this simple regular expression match what I think?

I have a data file that looks like this. I've written '%' in place of \t, the tab control character.

 1234:56% Alice Worthington alicew% Jan 1, 2010 10:20:30 AM% Closed% Development Digg: Reddit:
 Update%% file-one.txt% 1.1% c:/foo/bar/quux
 Add%% file-two.txt% 2.5.2% c:/foo/bar/quux
 Remove%% file-three.txt% 3.4% c:/bar/quux
 Update%% file-four.txt% 4.6.5.3% c:/zzz
 ... many more records of the above form

The entries that interest me are the lines starting with Update, Add, Delete, etc. I won't know ahead of time which words those lines begin with or how many lines precede them. I do know that they always start with a string of letters followed by two tabs. So I wrote this regex:

 generate-report-for 1234:56 | egrep "^[[:alpha:]]+\t\t.+" 

But this matches zero lines. Where am I going wrong?

Edit: I get the same results whether I use '...' or "..." for the egrep expression, so I'm not sure this is a shell issue.

+4
4 answers

Apparently \t is not a special character for egrep. You can use grep -P to enable the Perl-compatible regular expression engine, or insert literal tabs by typing Ctrl-V Ctrl-I.

Better yet, you could use the excellent ack.
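
As a minimal sketch of the literal-tab approach without typing Ctrl-V Ctrl-I, the tab can be produced with printf and spliced into the pattern. The sample records and the variable names (`sample`, `tab`, `matches`) are made up for illustration, modelled on the data in the question. In bash, `$'\t'` should work as well, and with a GNU grep built with PCRE support the `grep -P` form mentioned above is the other route.

```shell
# Sample records with real tab characters; printf expands each \t.
sample=$(printf '1234:56\tAlice Worthington\nUpdate\t\tfile-one.txt\t1.1\tc:/foo/bar/quux\nAdd\t\tfile-two.txt\t2.5.2\tc:/foo/bar/quux')

# Hold one literal tab in a variable (command substitution strips only trailing newlines).
tab=$(printf '\t')

# The pattern now contains two real tabs, so the ERE needs no \t escape.
matches=$(printf '%s\n' "$sample" | grep -E "^[[:alpha:]]+${tab}${tab}.+")
printf '%s\n' "$matches"
```

Only the Update and Add records should be printed; the header line starts with digits and is skipped.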

+3

It looks like the shell parses "\t\t" before sending it to egrep. Instead, try "\\t\\t" or '\t\t'. That is two backslashes inside double quotes and one inside single quotes.
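
One way to test this theory is to look at what the shell actually hands to the command. In a POSIX shell, a backslash inside double quotes is kept unless it precedes `$`, a backtick, `"`, `\`, or a newline, so the sketch below should show both quoting styles delivering the same four characters, which is consistent with the edit in the question. The variable names `dq` and `sq` are illustrative.

```shell
# The same pattern fragment in double quotes and in single quotes.
dq="\t\t"
sq='\t\t'

# Print each string exactly as a command would receive it.
printf '%s\n' "$dq" "$sq"

# Report the length of each string: backslash, t, backslash, t.
printf 'double-quoted length: %s\n' "${#dq}"
printf 'single-quoted length: %s\n' "${#sq}"
```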

0

The file may not contain what you think you see. Control characters can be hidden; it happens sometimes. My suggestion is to debug this step by step. First, reduce the regular expression to the smallest pattern that matches, then add pieces back one at a time until you find the problem:

 egrep "[[:alpha:]]"
 egrep "[[:alpha:]]+"
 egrep "[[:alpha:]]+\t"
 egrep "[[:alpha:]]+\t\t"
 egrep "[[:alpha:]]+\t\t.+"
 egrep "^[[:alpha:]]+\t\t.+"

This sequence can vary, depending on what you learn at each step. The first step could really be skipped; it is there only to show the technique.
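
To check the point about hidden control characters, one option is to dump a suspect line byte by byte with `od -c`, which prints a tab as `\t` and other control characters as escapes or octal codes. The line below is a made-up record in the shape of the question's data; in practice the output of generate-report-for would be piped in instead.

```shell
# A hypothetical record with two real tabs after the leading word.
line=$(printf 'Update\t\tfile-one.txt')

# od -c shows every byte; tabs appear as \t, so missing or extra ones stand out.
dump=$(printf '%s\n' "$line" | od -c)
printf '%s\n' "$dump"
```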

0

You can use awk, which does understand \t inside a regular expression:

 awk '/^[[:alpha:]]+\t\t/' file 
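
A variation on the same idea, offered only as a sketch: since the records are tab-separated, awk can split on tabs and test the fields directly instead of matching the whole line. Two consecutive tabs leave the second field empty. The sample input and the variable name `report` are assumptions for illustration, and `[A-Za-z]` is used in place of `[[:alpha:]]` because some older awk builds lack POSIX character classes.

```shell
# Split on tabs; keep records whose first field is all letters and whose
# second field is empty (that is, the first field is followed by two tabs).
report=$(
  printf '1234:56\tAlice Worthington\talicew\nUpdate\t\tfile-one.txt\t1.1\tc:/foo/bar/quux\nRemove\t\tfile-three.txt\t3.4\tc:/bar/quux\n' |
  awk -F'\t' '$1 ~ /^[A-Za-z]+$/ && $2 == "" && NF > 2 { print $1, $3, $4 }'
)
printf '%s\n' "$report"
```

This prints the action, file name, and revision of each matching record, which may be handy if the report needs further processing.
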
0

Source: https://habr.com/ru/post/1306152/

