How to extract numbers from a string?

My string contains a path

string="toto.titi.12.tata.2.abc.def" 

I want to extract only numbers from this line.

To extract the first number:

 tmp="${string#toto.titi.}"
 num1="${tmp%.tata*}"

To extract the second number:

 tmp="${string#toto.titi.*.tata.}"
 num2="${tmp%.abc.def}"

So, to extract each number, I have to do it in two steps. How can I extract a number in one step?

8 answers

To extract every number and print one number per line, use:

 tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g' 

How it works:

  • Replace all line breaks with spaces: tr '\n' ' '
  • Replace all non-digits with spaces: sed -e 's/[^0-9]/ /g'
  • Remove leading whitespace: -e 's/^ *//g'
  • Remove trailing whitespace: -e 's/ *$//g'
  • Squeeze runs of spaces into a single space: tr -s ' '
  • Replace the remaining space separators with line breaks: sed 's/ /\n/g'

Example:

 echo -e " this 20 is 2sen\nten324ce 2 sort of" | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g' 

This prints:

 20
 2
 324
 2
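
For use inside a script, the pipeline above could be wrapped in a function. This is only a sketch: extract_numbers is a made-up name, and the last two steps are merged into a single tr -s ' ' '\n', so the result does not depend on how a given sed treats \n in the replacement.

```shell
# extract_numbers: hypothetical helper (not from the answer above).
# Reads text on stdin and prints each run of digits on its own line.
extract_numbers() {
    tr '\n' ' ' |
        sed -e 's/[^0-9]/ /g' -e 's/^ *//' -e 's/ *$//' |
        tr -s ' ' '\n'
}

string="toto.titi.12.tata.2.abc.def"
numbers=$(printf '%s\n' "$string" | extract_numbers)
printf '%s\n' "$numbers"
```

For the string from the question this should print 12 and 2 on separate lines.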

You can use tr to delete all characters other than digits. Note that the digits are then joined together, so this example prints 122:

 echo toto.titi.12.tata.2.abc.def | tr -d -c 0-9 
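
If the numbers need to stay separate, a possible variation is to translate the non-digits to spaces instead of deleting them. A sketch, with variable names chosen just for illustration:

```shell
string="toto.titi.12.tata.2.abc.def"

# Deleting every non-digit concatenates the digits.
joined=$(printf '%s\n' "$string" | tr -d -c '0-9')

# Translating each run of non-digits to one space keeps them apart.
# \n stays in the set so the trailing newline is not turned into a space.
separated=$(printf '%s\n' "$string" | tr -cs '0-9\n' ' ')

echo "$joined"

# Unquoted on purpose: word splitting yields one number per iteration.
for n in $separated; do
    echo "$n"
done
```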

Parameter expansion seems to be the order of the day.

 $ string="toto.titi.12.tata.2.abc.def"
 $ read num1 num2 <<<${string//[^0-9]/ }
 $ echo "$num1 / $num2"
 12 / 2

This, of course, depends on the format of $string . But at least for the example you provided, it works.

This may perform better than anubhava's awk solution, which requires a subshell. I also like chepner's solution, but regular expressions are "heavier" than parameter expansion (although obviously more precise). (Note that in the expression above, [^0-9] may look like a regular-expression atom, but it is not; it is a shell pattern.)

You can read about this form of parameter expansion in the bash man page. Note that ${string//this/that} (as well as <<< ) is a bashism and is not compatible with traditional Bourne or POSIX shells.
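
For a plain POSIX shell, one possible substitute (a different technique from the expansion above) is to split the string on dots into the positional parameters. This sketch assumes the numbers always sit in fields 3 and 5, as in the example; old_ifs is just an illustrative name.

```shell
string="toto.titi.12.tata.2.abc.def"

old_ifs=$IFS
IFS=.            # split on dots only
set -f           # disable globbing while the unquoted expansion is split
set -- $string   # fields become $1, $2, $3, ...
set +f
IFS=$old_ifs

num1=$3
num2=$5
echo "$num1 / $num2"
```

Unlike the bash version, this relies on field positions rather than on "everything that is not a digit", so it would need adjusting if the layout changes.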


Using awk:

 arr=( $(echo "$string" | awk -F "." '{print $3, $5}') )
 num1=${arr[0]}
 num2=${arr[1]}

You can also use sed:

 echo "toto.titi.12.tata.2.abc.def" | sed 's/[0-9]*//g' 

Here sed replaces

  • any digit (class [0-9] )
  • repeated any number of times ( * )
  • with nothing (nothing between the second and third / ),
  • and g means globally.

The output will be:

 toto.titi..tata..abc.def 
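
The command above strips the digits out. If the goal is the opposite, keeping only the numbers, a sketch along the same lines would be to negate the class and replace each run of non-digits with a space (the variable name numbers is just for illustration):

```shell
string="toto.titi.12.tata.2.abc.def"

# Replace every run of non-digits with one space, then trim both ends.
numbers=$(printf '%s\n' "$string" |
    sed -e 's/[^0-9][^0-9]*/ /g' -e 's/^ //' -e 's/ $//')

echo "$numbers"
```

For the example string this should print 12 2.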

It would be easier to answer if you provided exactly the result that you want to get. If you mean that you want to get only the numbers from the string and delete everything else, you can do this:

 d@AirBox:~$ string="toto.titi.12.tata.2.abc.def"
 d@AirBox:~$ echo "${string//[a-z,.]/}"
 122

If you clarify a little, I can help more.


Use regex:

 string="toto.titi.12.tata.2.abc.def"
 [[ $string =~ toto\.titi\.([0-9]+)\.tata\.([0-9]+)\. ]]
 # BASH_REMATCH[0] would be "toto.titi.12.tata.2.", the entire match
 # Successive elements of the array correspond to the parenthesized
 # subexpressions, in left-to-right order. (If there are nested parentheses,
 # they are numbered in depth-first order.)
 first_number=${BASH_REMATCH[1]}
 second_number=${BASH_REMATCH[2]}
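
In a script it is probably worth checking whether the match succeeded before reading BASH_REMATCH. A minimal sketch, assuming bash; storing the pattern in a variable named re is a common habit, not something the answer above requires:

```shell
#!/usr/bin/env bash
string="toto.titi.12.tata.2.abc.def"

# Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
re='toto\.titi\.([0-9]+)\.tata\.([0-9]+)\.'

if [[ $string =~ $re ]]; then
    first_number=${BASH_REMATCH[1]}
    second_number=${BASH_REMATCH[2]}
    echo "$first_number $second_number"
else
    # No match: there is nothing useful to read from BASH_REMATCH here.
    echo "string does not have the expected layout" >&2
fi
```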

Another way to do this, using cut:

 echo "$string" | cut -d'.' -f3,5 | tr '.' ' '

This gives you the following result: 12 2
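
If the two values are needed in variables rather than on stdout, the same field positions (3 and 5) can be picked out with read instead of cut. A sketch only, assuming the layout is always dot-separated with the numbers in those positions; f1, f2, f4 and rest are placeholder names:

```shell
string="toto.titi.12.tata.2.abc.def"

# f1, f2, f4 and rest are throwaway names for the fields that are not needed.
# IFS=. applies to this read only; rest receives everything after field 5.
IFS=. read -r f1 f2 num1 f4 num2 rest <<EOF
$string
EOF

echo "$num1 $num2"
```

This does the extraction in a single command without starting any external process.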


Source: https://habr.com/ru/post/1493690/

