MATLAB - How to compare two strings character by character?

Essentially, I have two strings of the same length, for example "AGGTCT" and "AGGCCT". I want to compare them position by position and count the positions where they do not match. For this example I would like to get 1 as output, because the strings differ in only one position (position 4). If anyone has ideas for code that does a positional comparison, that would help me a lot.

Thanks!

3 answers

To count the number of mismatched characters in two strings of the same length, use:

sum( str1 ~= str2 ) 

If you want to be case insensitive, use:

 sum( lower(str1) ~= lower(str2) ) 

The expression str1 ~= str2 compares the two strings character by character, giving a logical vector of the same size as the strings, with true where the characters differ and false where they match. To get the result, simply sum the true values (the mismatches).
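As a minimal sketch using the two sequences from the question, the snippet below counts the mismatches and also reports where they occur. The variable names mismatch, numMismatch and positions are only illustrative, and the code assumes the inputs are char vectors (single quotes); for double-quoted string scalars, ~= compares the strings as a whole, so they would need converting with char() first.

```matlab
% Example sequences from the question, as char vectors of equal length
str1 = 'AGGTCT';
str2 = 'AGGCCT';

% Logical row vector: true where the characters differ
mismatch = (str1 ~= str2);

% Number of differing positions
numMismatch = sum(mismatch);

% Indices of the differing positions
positions = find(mismatch);

fprintf('%d mismatch(es) at position(s): %s\n', numMismatch, mat2str(positions));
```

For these inputs the only differing position is 4 (T versus C), so numMismatch is 1.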

EDIT: if you want to count the number of matching characters:

  • Use the "equal to" == operator (instead of the "not equal" ~= operator):

     sum( str1 == str2 ) 
  • Subtract the number of discrepancies from the total:

     numel(str1) - sum( str1 ~= str2 ) 

You can compare the two strings as a whole, element by element:

 r = all(seq1 == seq2) 

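This compares the strings character by character and returns true only if every element of the resulting array is true, that is, if the strings are identical. If the strings can have different lengths, compare the lengths first. An alternative is the inverse test, which returns true if the strings differ in at least one position: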

 r = any(seq1 ~= seq2) 

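Another solution is to use strcmp, which returns true when the strings are identical and simply returns false for strings of different lengths: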

 r = strcmp(seq1, seq2) 
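A small sketch of the length check mentioned above, assuming seq1 and seq2 are char row vectors. Applying == to char vectors of different lengths raises an error (unless one of them is a single character), so the lengths are tested first. The names rAll and rStrcmp are only illustrative.

```matlab
seq1 = 'AGGTCT';
seq2 = 'AGGCCT';

% Guard the element-wise comparison with a length check
if numel(seq1) == numel(seq2)
    rAll = all(seq1 == seq2);   % true only if every position matches
else
    rAll = false;               % different lengths can never be equal
end

% strcmp needs no guard: it returns false for different lengths
rStrcmp = strcmp(seq1, seq2);
```

Both variables are false for these inputs, because the sequences differ at position 4. Note that these tests only say whether the strings are equal, not how many positions differ.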

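I just want to note that what you are asking for is the Hamming distance (since you asked for alternatives: the article links to some). This has already been discussed here. In short, the pdist command (part of the Statistics and Machine Learning Toolbox) can do this; note that its 'hamming' option returns the fraction of positions that differ rather than the count.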
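A hedged sketch of the pdist route, assuming the Statistics and Machine Learning Toolbox is installed and that both sequences have the same length so they can be stacked as rows of one matrix. The names X, fracDiff and hammingCount are only illustrative.

```matlab
str1 = 'AGGTCT';
str2 = 'AGGCCT';

% pdist expects a numeric matrix with one observation per row,
% so stack the sequences and convert the characters to their codes
X = double([str1; str2]);

% 'hamming' gives the fraction of coordinates that differ
fracDiff = pdist(X, 'hamming');

% Scale by the sequence length to recover the number of mismatches
hammingCount = round(fracDiff * numel(str1));
```

For two short strings, sum(str1 ~= str2) is simpler and needs no toolbox; pdist becomes more useful when comparing many sequences pairwise.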


Source: https://habr.com/ru/post/1481169/

