Grep only the second part after the space

Question

I have a parser in a shell script:

Here is the input file to be parsed (input.txt):
system.switch_cpus.commit.swp_count                 0                       # Number of s/w prefetches committed
  system.switch_cpus.commit.refs                2682887                       # Number of memory references committed
  system.switch_cpus.commit.loads               1779328                       # Number of loads committed                                                                                                                                                                                                                
  system.switch_cpus.commit.membars                   0                       # Number of memory barriers committed
  system.switch_cpus.commit.branches             921830                       # Number of branches committed
  system.switch_cpus.commit.vec_insts                 0                       # Number of committed Vector instructions.
  system.switch_cpus.commit.fp_insts                  0                       # Number of committed floating point instructions.
  system.switch_cpus.commit.int_insts          10000000                       # Number of committed integer instructions.

The script does the following:

 $ cpu1_name="system.switch_cpus"
 $ echo "$(grep "${cpu1_name}.commit.loads" ./input.txt |grep -Eo '[0-9]+')"
 correct expected output: 1779328

But in another file, the value of the variable "cpu1_name" changes to "system.switch_cpus_1". Executing the same script now gives me 2 values:

New input file:
system.switch_cpus_1.commit.swp_count               0                       # Number of s/w prefetches committed
 system.switch_cpus_1.commit.refs              2682887                       # Number of memory references committed
 system.switch_cpus_1.commit.loads             1779328                       # Number of loads committed                                                                                                                                                                                                               
 system.switch_cpus_1.commit.membars                 0                       # Number of memory barriers committed
 system.switch_cpus_1.commit.branches           921830                       # Number of branches committed
 system.switch_cpus_1.commit.vec_insts               0                       # Number of committed Vector instructions.
 system.switch_cpus_1.commit.fp_insts                0                       # Number of committed floating point instructions.   


Modified Script line:
$ cpu1_name="system.switch_cpus_1"
$ echo "$(grep "${cpu1_name}.commit.loads" ./new_input.txt |grep -Eo '[0-9]+')"
1
1779328

As you can see, the second grep matches any run of digits, so it also reports "1" because of the changed variable name.

Is there a way to select only the second number (i.e. only 1779328)? I know that I can use awk '{print $2}', but that would mean changing a lot of lines in the script. So I wondered whether there is a simpler trick that works with the existing script lines.

Thanks in advance

+4

bash regex grep awk

user3285014 Mar 09 '18 at 7:20

4

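If the number is always delimited by known characters (whitespace, for example), you can use a positive lookbehind (?<=pattern) and a positive lookahead (?=pattern), which check the surrounding context without including it in the match.

Note that lookarounds require grep's -P option.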

+1

Martin Heralecký 09 . '18 7:26

An awk solution (no grep pipeline needed):

awk -v x="${cpu1_name}.commit.loads" '$1==x{print $2}' input.txt

This should work with any POSIX awk. For example:

$ awk -v x="${cpu1_name}.commit.loads" '$1==x{print $2}' input.txt
1779328
$ awk -v x="${cpu1_name}.commit.loads" '$1==x{print $2}' new_input.txt
1779328

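-v x="${cpu1_name}.commit.loads"
This creates an awk variable x whose value is the name we are looking for.
$1==x{print $2}
If the first field, $1, equals x, print the second field, $2.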
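
Since the question worries about touching many script lines, the awk call could be wrapped in a small helper so that each call site changes only once. The following is a minimal sketch, not part of the original answer: the function name get_stat, the temporary file, and the second value (2500000) are made up for illustration. Because $1==x is an exact string comparison, the name for one CPU cannot accidentally match the other.

```shell
# Hypothetical helper: print the value column for an exact stat name.
#   $1 = full stat name, $2 = stats file
get_stat() {
    awk -v x="$1" '$1 == x { print $2 }' "$2"
}

# Made-up sample data containing both CPU names, with different values.
stats=$(mktemp)
cat > "$stats" <<'EOF'
system.switch_cpus.commit.loads               1779328    # Number of loads committed
 system.switch_cpus_1.commit.loads             2500000    # Number of loads committed
EOF

loads0=$(get_stat "system.switch_cpus.commit.loads" "$stats")
loads1=$(get_stat "system.switch_cpus_1.commit.loads" "$stats")
printf '%s\n%s\n' "$loads0" "$loads1"
rm -f "$stats"
```

A call such as get_stat "${cpu1_name}.commit.loads" ./input.txt would then stand in for the two-stage grep pipeline.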

+1

John1024 Mar 09 '18 at 7:29


You can simply change the second grep command to:

grep -oP '(?<=\s)[0-9]+'

which requires whitespace in front of the string of digits. Or, more concisely:

grep -oP '(?<=\s)\d+'

To also require whitespace after the digits, use grep -oP '(?<=\s)\d+(?=\s)' or grep -oP '(?<=\s)[0-9]+(?=\s)'.
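
To see the lookaround patterns in action, here is a small sketch that feeds one line from the question through them. It assumes a grep built with PCRE support (the -P option, as in GNU grep); the variable names line, behind, and both are only illustrative.

```shell
# One line copied from the question's new input file.
line=' system.switch_cpus_1.commit.loads             1779328                       # Number of loads committed'

# Lookbehind only: the digits must be preceded by whitespace.
behind=$(printf '%s\n' "$line" | grep -oP '(?<=\s)\d+')

# Lookbehind and lookahead: the digits must have whitespace on both sides.
both=$(printf '%s\n' "$line" | grep -oP '(?<=\s)\d+(?=\s)')

printf '%s\n%s\n' "$behind" "$both"
```

The 1 in switch_cpus_1 is preceded by an underscore rather than whitespace, so neither pattern reports it.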

0

Allan Mar 09 '18 at 7:25


Wiktor Stribiżew · Accepted Answer · 2018-03-09T09:12:43+0000

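_ is a word char, so there is no word boundary between _ and 1. That is why the 1 in switch_cpus_1 cannot match as a whole word.

So, to extract only the standalone number, match the digits as a whole word. You can use the -w option or the \b or \</\> word boundaries, all of which are supported by GNU grep: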

grep -Ewo '[0-9]+'
grep -Eo '\b[0-9]+\b'
grep -Eo '\<[0-9]+\>'
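
As a quick check, the sketch below compares the original pattern with the -w variant on the line from the question. It assumes GNU grep; the temporary file and the variable names old and new are only for illustration.

```shell
# Temporary sample file holding the relevant line from the question.
tmp=$(mktemp)
printf '%s\n' ' system.switch_cpus_1.commit.loads             1779328                       # Number of loads committed' > "$tmp"

cpu1_name="system.switch_cpus_1"

# Original pattern: matches every run of digits, including the 1 in the name.
old=$(grep "${cpu1_name}.commit.loads" "$tmp" | grep -Eo '[0-9]+')

# Whole-word pattern: the 1 after _ is not a standalone word, so it is skipped.
new=$(grep "${cpu1_name}.commit.loads" "$tmp" | grep -Ewo '[0-9]+')

printf 'old:\n%s\n' "$old"
printf 'new:\n%s\n' "$new"
rm -f "$tmp"
```

Only the option string of the second grep changes, so the existing script lines could be updated with a single search and replace.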


Alternatively, you can use sed to extract the second whitespace-separated field:

sed -E 's/^\s*\S+\s+(\S+).*/\1/'

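Details:

^ - start of the line
\s* - 0+ whitespace chars
\S+ - 1+ non-whitespace chars
\s+ - 1+ whitespace chars
(\S+) - 1+ non-whitespace chars (Group 1, referenced as \1 in the replacement)
.* - the rest of the line.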
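
The sed command can be tried on a single line as sketched below. The \s and \S shorthands are GNU extensions, so a second variant using POSIX bracket classes is included on the assumption that some sed implementations with -E may not accept the shorthands; the variable names are illustrative.

```shell
# One line copied from the question's new input file.
line=' system.switch_cpus_1.commit.loads             1779328                       # Number of loads committed'

# GNU sed: \s and \S are whitespace / non-whitespace shorthands.
gnu_value=$(printf '%s\n' "$line" | sed -E 's/^\s*\S+\s+(\S+).*/\1/')

# Same substitution written with POSIX character classes.
posix_value=$(printf '%s\n' "$line" |
    sed -E 's/^[[:space:]]*[^[:space:]]+[[:space:]]+([^[:space:]]+).*/\1/')

printf '%s\n%s\n' "$gnu_value" "$posix_value"
```

Like the awk approach, this picks the second field by position, so digits inside the stat name never come into play.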
