How can you check for specific characters in a string?

When I run the program, it always prints True. For example, if I enter AAJJ, it returns True because it only checks whether the first letter is valid. Can someone point me in the right direction? Thanks!

squence_str = raw_input("Enter either A DNA, Protein or RNA sequence:")

def DnaCheck():

    for i in (squence_str):
        if string.upper(i) =="A":
            return True
        elif string.upper(i) == "T":
            return True
        elif string.upper(i) == "C":
            return True
        elif string.upper(i) == "G":
            return True
        else:
            return False

print "DNA ", DnaCheck()
3 answers

You need to check that all bases in the DNA sequence are valid.

def DnaCheck(sequence):
    return all(base.upper() in ('A', 'C', 'T', 'G') for base in sequence)
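
To see why the question's loop misbehaves, here is a minimal sketch that puts the two behaviours side by side. The names `first_char_only` and `dna_check_all` are hypothetical, chosen only for this illustration, and the code is written so that it should run on both Python 2 and Python 3.

```python
def first_char_only(sequence):
    # Mirrors the question's loop: every branch returns, so the function
    # exits after examining only the first character.
    for base in sequence:
        if base.upper() in "ACGT":
            return True
        else:
            return False


def dna_check_all(sequence):
    # all() stops at the first invalid base and returns True only when
    # no invalid base is found.
    return all(base.upper() in "ACGT" for base in sequence)


# "AAJJ" starts with a valid base, so the first version is fooled.
fooled = first_char_only("AAJJ")
correct = dna_check_all("AAJJ")
```

Here `fooled` is True while `correct` is False, which matches the symptom described in the question.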

I like @Alexander's answer, but as an alternative you could use:

def dna_check(sequence):
    return set(sequence.upper()).issubset("ACGT")
    # another possibility:
    # return set(sequence).issubset("ACGTacgt")

This can be faster for long sequences, especially when most inputs are valid, since in that case you have to iterate over the entire sequence anyway.
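
If you want to check the speed claim on your own data, a rough timing harness along these lines may help. This is only a sketch: `check_all`, `check_set`, and `compare` are hypothetical names, the sequence lengths are arbitrary, and the actual numbers will depend on your machine and Python version.

```python
import timeit


def check_all(sequence):
    # Generator-based check: can stop at the first invalid base.
    return all(base.upper() in "ACGT" for base in sequence)


def check_set(sequence):
    # Set-based check: always processes the whole sequence once.
    return set(sequence.upper()).issubset("ACGT")


def compare(sequence, repeats=5):
    """Return (seconds_all, seconds_set) for `repeats` runs of each check."""
    t_all = timeit.timeit(lambda: check_all(sequence), number=repeats)
    t_set = timeit.timeit(lambda: check_set(sequence), number=repeats)
    return t_all, t_set


valid = "ACGT" * 25000    # 100,000 valid bases: both checks scan everything
invalid = "X" + valid     # invalid at position 0: all() can stop immediately
```

Calling `compare(valid)` and `compare(invalid)` gives one timing pair for each case. The two functions should always agree on the result; only the running time differs.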


Your logic is reversed. You must check every position. If any character is not one of the nucleotides in "ACTG", immediately return False for the whole string. Only after every character has passed can you confidently return True.

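def DnaCheck(squence_str):
    # Reject the sequence as soon as one character is not a DNA base.
    for i in squence_str:
        if i.upper() not in "ACTG":
            return False

    # Every character passed the check (an empty string also returns True).
    return True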

test_cases = ["", "AAJJ", "ACTG", "AACTGTCAA", "AACTGTCAX"]
for strand in test_cases:
    print strand, DnaCheck(strand)

Output:

 True
AAJJ False
ACTG True
AACTGTCAA True
AACTGTCAX False
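
Since the question's prompt also mentions RNA and protein sequences, the same idea can be extended to several alphabets. The following is a sketch under assumptions, not part of the original answers: `ALPHABETS` and `matching_alphabets` are hypothetical names, and the protein alphabet is assumed to be the 20 standard one-letter amino acid codes.

```python
# Hypothetical alphabets for illustration; not taken from the original post.
ALPHABETS = {
    "DNA": set("ACGT"),
    "RNA": set("ACGU"),
    "Protein": set("ACDEFGHIKLMNPQRSTVWY"),  # 20 standard amino acids (assumed)
}


def matching_alphabets(sequence):
    """Return the sorted names of every alphabet that covers the sequence."""
    letters = set(sequence.upper())
    # `letters <= alphabet` is the subset test, equivalent to issubset().
    return sorted(name for name, alphabet in ALPHABETS.items()
                  if letters <= alphabet)
```

Note that a string such as "ACTG" fits both the DNA and the protein alphabet, so a character check alone cannot always tell the sequence types apart.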

Source: https://habr.com/ru/post/1685637/

