How to find common characters between two strings in bash?

Question

How to find common characters between two strings in bash?

For example:

s1="my_foo" s2="not_my_bar"

The desired result is my_o. How can this be done in bash?

+5

string bash shell

johannes Aug 04 2018-11-11T00:


7 answers

Assuming the strings don't contain embedded newlines:

 s1='my_foo' s2='my_bar'
 intersect=$(
   comm -12 <( fold -w1 <<< "$s1" | sort -u ) <( fold -w1 <<< "$s2" | sort -u ) | tr -d \\n
 )
 printf '%s\n' "$intersect"

And one more:

 tr -dc "$s2" <<< "$s1"
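The two commands above answer slightly different questions: the comm pipeline returns a sorted set of distinct characters, whereas tr -dc keeps the characters of s1 in their original order, duplicates included. A minimal sketch of the tr variant on the question's input follows. The function name keep_common is made up for illustration, and the sketch assumes the second string holds no characters that tr treats specially (such as -, \ or [):

```shell
# keep_common is a hypothetical helper name, not part of the answer above.
# It deletes from $1 every character that does not occur in $2.
# Assumption: $2 holds no characters that tr interprets specially (-, \, [).
keep_common() {
    printf '%s' "$1" | tr -dc "$2"
}

s1="my_foo" s2="not_my_bar"
keep_common "$s1" "$s2"; echo     # my_oo (both o's of s1 are kept)
keep_common "$s1" "my_bar"; echo  # my_
```

If each common character should appear only once, as in the question's my_o, the result still needs a deduplication step.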

+2

Dimitre Radoulov Aug 04 2018-11-11T00:


late post, I just found this page:

 echo "$str2" | awk 'BEGIN{FS=""} { n=0; while(n<=NF) { if ($n == substr(test,n,1)) { if(!found[$n]) printf("%c",$n); found[$n]=1;} n++; } print ""}' test="$str1"

And one more: it builds a regular expression to match (note: it doesn't work with special characters, but that's not hard to fix with another sed):

 echo "$str1" | grep -E -o ^`echo -n "$str2" | sed 's/\(.\)/(|\1/g'; echo "$str2" | sed 's/./)/g'`

+2

Karoly Horvath Aug 07 2018-11-11T00:


This should be a portable solution:

 s1="my_foo"
 s2="my_bar"
 while [ -n "$s1" -a -n "$s2" ]
 do
     if [ "${s1:0:1}" = "${s2:0:1}" ]
     then
         printf %s "${s1:0:1}"
     else
         break
     fi
     s1="${s1:1:${#s1}}"
     s2="${s2:1:${#s2}}"
 done
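One caveat worth checking against your own shell: the loop above compares the two strings position by position and stops at the first mismatch, so for my_foo and my_bar it prints the common prefix my_. The ${s1:0:1} substring expansion is also, to my knowledge, a bash/ksh feature rather than POSIX sh. Below is a minimal sketch of a variant that tests membership instead and uses only POSIX parameter expansion and case. The name common_chars and its variables are made up for this illustration:

```shell
# common_chars is a hypothetical helper name used only for this sketch.
# Prints each character of $1 that also occurs in $2, once, in $1's order.
# Note: POSIX sh has no "local", so the variables below are global.
common_chars() {
    rest=$1 other=$2 seen=
    while [ -n "$rest" ]; do
        c=${rest%"${rest#?}"}      # first character of $rest
        rest=${rest#?}             # drop that character
        case $seen in *"$c"*) continue ;; esac    # already emitted
        case $other in *"$c"*) seen=$seen$c ;; esac
    done
    printf '%s\n' "$seen"
}

common_chars "my_foo" "not_my_bar"   # my_o
```

Quoting "$c" inside the case patterns keeps characters such as * or ? from being read as glob patterns.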

+1

l0b0 Aug 04 2018-11-11T00:


 comm=""
 for ((i=0;i<${#s1};i++))
 do
     # Quote both operands so test still works when s2 is shorter than s1
     if test "${s1:$i:1}" = "${s2:$i:1}"
     then
         comm=${comm}${s1:$i:1}
     fi
 done

+1

ajreal Aug 04 2018-11-11T00:


Solution using single sed execution:

 echo -e "$s1\n$s2" | sed -e 'N;s/^/\n/;:begin;s/\n\(.\)\(.*\)\n\(.*\)\1\(.*\)/\1\n\2\n\3\4/;t begin;s/\n.\(.*\)\n\(.*\)/\n\1\n\2/;t begin;s/\n\n.*//'

Like all cryptic sed scripts, this one needs an explanation. Here it is in the form of a sed script file that can be run with echo -e "$s1\n$s2" | sed -f script:

 # Read the next line so s1 and s2 are in the pattern space only separated by a \n.
 N
 # Put a \n at the beginning of the pattern space.
 s/^/\n/
 # During the script execution, the pattern space will contain <result so far>\n<what left of s1>\n<what left of s2>.
 :begin
 # If the 1st char of s1 is found in s2, remove it from s1 and s2, append it to the result and do this again until it fails.
 s/\n\(.\)\(.*\)\n\(.*\)\1\(.*\)/\1\n\2\n\3\4/
 t begin
 # When previous substitution fails, remove 1st char of s1 and try again to find 1st char of s1 in s2.
 s/\n.\(.*\)\n\(.*\)/\n\1\n\2/
 t begin
 # When previous substitution fails, s1 is empty so remove the \n and what is left of s2.
 s/\n\n.*//

If you want to remove duplicates, add the following at the end of the script:

 :end;s/\(.\)\(.*\)\1/\1\2/;t end

Edit: I realize that dogbane's pure shell solution uses the same algorithm and is probably more efficient.

+1

jfg956 Aug 04 2018-11-11T00:


Since everyone loves Perl one-liners full of punctuation:

perl -e '$a{$_}++ for split "",shift; $b{$_}++ for split "",shift; for (sort keys %a){print if defined $b{$_}}' my_foo not_my_bar

This builds the hashes %a and %b from the input strings, then prints every character common to both strings, in sorted order.

Output:

 _moy

0

Chris Koknat Nov 06 '15 at


dogbane · Accepted Answer · 2011-08-04 13:18

In my solution below, fold is used to split each string into one character per line, sort to sort the resulting lists, comm to compare the two sorted lists, and finally tr to remove the newline characters:

 comm -12 <(fold -w1 <<< "$s1" | sort -u) <(fold -w1 <<< "$s2" | sort -u) | tr -d '\n'
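As a hedged illustration, the pipeline above can be wrapped in a bash function. The name common_sorted is hypothetical, and forcing LC_ALL=C is an added assumption, made so that sort and comm agree on a byte-wise ordering regardless of the user's locale:

```shell
# common_sorted is a hypothetical wrapper name for the pipeline above.
# Requires bash (process substitution and here-strings).
# LC_ALL=C is an added assumption: it gives sort and comm the same
# byte-wise ordering, so the output does not depend on the locale.
common_sorted() {
    LC_ALL=C comm -12 \
        <(fold -w1 <<< "$1" | LC_ALL=C sort -u) \
        <(fold -w1 <<< "$2" | LC_ALL=C sort -u) | tr -d '\n'
    echo    # finish the output with a newline
}

common_sorted "my_foo" "not_my_bar"   # _moy
```

The result is a sorted set, so the characters do not appear in the order they had in either input string.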

Alternatively, here is a pure Bash solution (which also preserves character order). It iterates over the first string and checks whether each character is present in the second string.

 s="temp_foo_bar"
 t="temp_bar"
 i=0
 while [ $i -ne ${#s} ]
 do
     c=${s:$i:1}
     if [[ $result != *$c* && $t == *$c* ]]
     then
         result=$result$c
     fi
     ((i++))
 done
 echo $result

Prints: temp_bar
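For reuse, the same loop can be sketched as a bash function. The name common_ordered is made up for this example. Compared with the loop above it keeps its variables local and quotes "$c" inside the patterns, which is an added precaution so that input characters such as ? are matched literally:

```shell
# common_ordered is a hypothetical function name wrapping the loop above.
# Requires bash ([[ ]], ${s:i:1}, local).
common_ordered() {
    local s=$1 t=$2 result="" c i
    for ((i = 0; i < ${#s}; i++)); do
        c=${s:i:1}
        # Quoting "$c" makes the match literal, so a character such as
        # ? or [ in the input is not treated as a glob pattern.
        if [[ $result != *"$c"* && $t == *"$c"* ]]; then
            result+=$c
        fi
    done
    printf '%s\n' "$result"
}

common_ordered "my_foo" "not_my_bar"       # my_o
common_ordered "temp_foo_bar" "temp_bar"   # temp_bar
common_ordered "a?b" "xyz"                 # prints an empty line
```

With the question's input this gives my_o, the result asked for.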

