How to get all DNA encodings for a peptide in C#

Hi, I have been stuck on this for 3 days! I want to get all the DNA encodings for a peptide. A peptide is a sequence of amino acids, e.g. amino acid M and amino acid Q can form the peptide MQ or QM.

DNA encoding means that each amino acid has a DNA code (called a codon). Some amino acids have more than one codon, e.g. amino acid T has 4 different codons.

The last function in the following code does not work, so I would like someone to help me get it working. Please do not require language-integrated query (I forgot its abbreviation!).

    private string[] CODONS =
    {
        "TTT", "TTC", "TTA", "TTG", "TCT", "TCC", "TCA", "TCG",
        "TAT", "TAC", "TGT", "TGC", "TGG",
        "CTT", "CTC", "CTA", "CTG", "CCT", "CCC", "CCA", "CCG",
        "CAT", "CAC", "CAA", "CAG", "CGT", "CGC", "CGA", "CGG",
        "ATT", "ATC", "ATA", "ATG", "ACT", "ACC", "ACA", "ACG",
        "AAT", "AAC", "AAA", "AAG", "AGT", "AGC", "AGA", "AGG",
        "GTT", "GTC", "GTA", "GTG", "GCT", "GCC", "GCA", "GCG",
        "GAT", "GAC", "GAA", "GAG", "GGT", "GGC", "GGA", "GGG",
    };

    private string[] AMINOS_PER_CODON =
    {
        "F", "F", "L", "L", "S", "S", "S", "S",
        "Y", "Y", "C", "C", "W",
        "L", "L", "L", "L", "P", "P", "P", "P",
        "H", "H", "Q", "Q", "R", "R", "R", "R",
        "I", "I", "I", "M", "T", "T", "T", "T",
        "N", "N", "K", "K", "S", "S", "R", "R",
        "V", "V", "V", "V", "A", "A", "A", "A",
        "D", "D", "E", "E", "G", "G", "G", "G",
    };

    public string codonToAminoAcid(String codon)
    {
        for (int k = 0; k < CODONS.Length; k++)
        {
            if (CODONS[k].Equals(codon))
            {
                return AMINOS_PER_CODON[k];
            }
        }
        // never reach here with valid codon
        return "X";
    }

    public string AminoAcidToCodon(String aminoAcid)
    {
        for (int k = 0; k < AMINOS_PER_CODON.Length; k++)
        {
            if (AMINOS_PER_CODON[k].Equals(aminoAcid))
            {
                return CODONS[k];
            }
        }
        // never reach here with valid codon
        return "X";
    }

    public string GetCodonsforPeptide(string pep)
    {
        string result = "";
        for (int i = 0; i < pep.Length; i++)
        {
            result = AminoAcidToCodon(pep.Substring(i, 1));
            for (int q = 0; q < pep.Length; q++)
            {
                result += AminoAcidToCodon(pep.Substring(q, 1));
            }
        }
        return result;
    }
1 answer

Try using the following two methods:

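    // Assumes AMINOS_PER_CODON is declared as char[] (see the notes below).
    public IEnumerable<string> AminoAcidToCodon(char aminoAcid)
    {
        for (int k = 0; k < AMINOS_PER_CODON.Length; k++)
        {
            if (AMINOS_PER_CODON[k] == aminoAcid)
            {
                // Yield every matching codon instead of returning the first one.
                yield return CODONS[k];
            }
        }
    }

    public IEnumerable<string> GetCodonsforPeptide(string pep)
    {
        // Base case: the empty peptide has exactly one encoding, the empty string.
        if (string.IsNullOrEmpty(pep))
        {
            yield return string.Empty;
            yield break;
        }

        // Combine each codon of the first amino acid with every encoding of the rest.
        foreach (var codon in AminoAcidToCodon(pep[0]))
            foreach (var codonOfRest in GetCodonsforPeptide(pep.Substring(1)))
                yield return codon + codonOfRest;
    }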

Notes:

  • Since each amino acid can have several matching codons, your method, which returns as soon as it finds the first match, will only ever produce one codon per amino acid. Instead, I created an iterator method that does a yield return for each matching codon.
  • The last method finds all matching codons for the first character of the peptide and combines each such codon with every encoding of the rest of the peptide after the first character.
  • I used a char array for AMINOS_PER_CODON instead of a string array. You can easily change the code to use your string array if you want.
  • A better approach, without two separate arrays, would be to create a dictionary that maps each amino acid character to a list of codon strings.

Example output when passing "MA":

    ATGGCT
    ATGGCC
    ATGGCA
    ATGGCG

This is because M maps to:

    ATG

and A maps to:

    GCT
    GCC
    GCA
    GCG
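The number of results is the product of the codon counts of each amino acid (1 × 4 = 4 for "MA"), so it grows quickly with peptide length and can be worth checking before enumerating everything. A minimal sketch, assuming a hypothetical `EncodingCounter.CountEncodings` helper (not part of the original answer) and a deliberately partial table that covers only M and A:

```csharp
using System;
using System.Collections.Generic;

public static class EncodingCounter
{
    // Hypothetical, partial table: only the two amino acids from the "MA" example.
    private static readonly Dictionary<char, string[]> CodonsByAminoAcid =
        new Dictionary<char, string[]>
        {
            { 'M', new[] { "ATG" } },
            { 'A', new[] { "GCT", "GCC", "GCA", "GCG" } }
        };

    // Number of DNA encodings = product of the codon counts of each amino acid.
    // An empty peptide counts as 1 (the single empty encoding).
    public static long CountEncodings(string peptide)
    {
        long count = 1;
        foreach (char aminoAcid in peptide)
        {
            string[] codons;
            if (!CodonsByAminoAcid.TryGetValue(aminoAcid, out codons))
                throw new ArgumentException("Unknown amino acid: " + aminoAcid);
            count *= codons.Length;
        }
        return count;
    }
}
```

With this table, `EncodingCounter.CountEncodings("MA")` returns 4, matching the four strings listed above.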

The dictionary that I propose to use will look like this:

    var codonsByAminoAcid = new Dictionary<char, string[]>
    {
        { 'M', new[] { "ATG" } },
        { 'A', new[] { "GCT", "GCC", "GCA", "GCG" } }
    };

This will replace the AminoAcidToCodon method.

You can even build this dictionary from your two arrays:

    // Requires: using System.Linq;
    var lookup = CODONS
        .Zip(AMINOS_PER_CODON, (codon, amino) => new { codon, amino })
        .GroupBy(entry => entry.amino)
        .ToDictionary(
            g => g.Key,
            g => g.Select(ge => ge.codon).ToArray());
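Since the question asks to avoid LINQ, the same grouping can also be done with plain loops. The following is only a sketch: `CodonTable` and `Build` are hypothetical names, and it assumes the amino acid array is a `char[]`, as in the notes above.

```csharp
using System;
using System.Collections.Generic;

public static class CodonTable
{
    // Builds an amino acid -> codons table from two parallel arrays, without LINQ.
    public static Dictionary<char, List<string>> Build(string[] codons, char[] aminosPerCodon)
    {
        if (codons.Length != aminosPerCodon.Length)
            throw new ArgumentException("Both arrays must have the same length.");

        var table = new Dictionary<char, List<string>>();
        for (int k = 0; k < codons.Length; k++)
        {
            List<string> list;
            if (!table.TryGetValue(aminosPerCodon[k], out list))
            {
                // First codon seen for this amino acid: start a new list.
                list = new List<string>();
                table[aminosPerCodon[k]] = list;
            }
            list.Add(codons[k]);
        }
        return table;
    }
}
```

Codons are appended in array order, so each amino acid's list keeps the same order as the original CODONS array.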

The GetCodonsforPeptide method may look like this:

    public IEnumerable<string> GetCodonsforPeptide(string pep)
    {
        if (string.IsNullOrEmpty(pep))
        {
            yield return string.Empty;
            yield break;
        }

        // Index into the dictionary (square brackets) instead of calling AminoAcidToCodon.
        foreach (var codon in lookup[pep[0]])
            foreach (var codonOfRest in GetCodonsforPeptide(pep.Substring(1)))
                yield return codon + codonOfRest;
    }

i.e. replace the call to the other method with a lookup in the dictionary.
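If the recursion or the repeated `Substring` calls are a concern for long peptides, the same combinations can be built iteratively. This is a sketch of an alternative, not the method from the answer: `PeptideEncoder` is a hypothetical name, and the table is passed in as a parameter so that any dictionary of the shape shown above can be used.

```csharp
using System;
using System.Collections.Generic;

public static class PeptideEncoder
{
    // Iterative variant: extend every partial DNA string by each codon
    // of the next amino acid, one amino acid at a time.
    public static List<string> GetCodonsForPeptide(
        string peptide, IDictionary<char, string[]> codonsByAminoAcid)
    {
        // Start with the single empty encoding (same base case as the recursive version).
        var results = new List<string> { string.Empty };

        foreach (char aminoAcid in peptide ?? string.Empty)
        {
            string[] codons;
            if (!codonsByAminoAcid.TryGetValue(aminoAcid, out codons))
                throw new ArgumentException("Unknown amino acid: " + aminoAcid);

            var extended = new List<string>(results.Count * codons.Length);
            foreach (string prefix in results)
                foreach (string codon in codons)
                    extended.Add(prefix + codon);

            results = extended;
        }
        return results;
    }
}
```

Called with "MA" and the two-entry dictionary shown earlier, it returns ATGGCT, ATGGCC, ATGGCA, ATGGCG in that order. Unlike the iterator version, it holds all results in memory at once, which is the trade-off for dropping the recursion.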


Source: https://habr.com/ru/post/1206728/

