sample = [['CGG','ATT'],['GCGC','TAAA']]
d1 = [[{'G': 0.66, 'C': 0.33}, {'A': 0.33, 'T': 0.66}], [{'G': 0.5, 'C': 0.5}, {'A': 0.75, 'T': 0.25}]]
d2 = [{('C', 'A'): 0.33, ('G', 'T'): 0.66}, {('G', 'T'): 0.25, ('C', 'A'): 0.5, ('G', 'A'): 0.25}]
Problem:
Consider the first pair: ['CGG', 'ATT']
How do I calculate a, where:
a = (freq of pair) - ((freq of C in CGG) * (freq of A in ATT))
e.g. for CA pairs, a = (freq of CA pairs) - ((freq of C in CGG) * (freq of A in ATT))
Output: a = (0.33) - ((0.33) * (0.33)) = 0.222222 (using the exact frequency 1/3; the rounded 0.33 gives 0.2211)
“a” only needs to be calculated for one combination per entry (either the CA pair or the GT pair).
Final output for sample: a = [0.2222, -0.125]
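A minimal sketch of how a could be pulled out of d1 and d2 with a plain for loop. The function name calc_a and its pair argument are made up for illustration, and it assumes d1 and d2 are in the same order as sample. Two things to keep in mind: with the rounded 0.33 stored in d1 and d2 the first value comes out as roughly 0.2211 rather than 0.2222, and for the CA pair the second value is +0.125, while -0.125 is what the GA pair gives.

```python
d1 = [[{'G': 0.66, 'C': 0.33}, {'A': 0.33, 'T': 0.66}],
      [{'G': 0.5, 'C': 0.5}, {'A': 0.75, 'T': 0.25}]]
d2 = [{('C', 'A'): 0.33, ('G', 'T'): 0.66},
      {('G', 'T'): 0.25, ('C', 'A'): 0.5, ('G', 'A'): 0.25}]


def calc_a(d1, d2, pair=('C', 'A')):
    """Return one value of a per entry for the chosen pair (hypothetical helper)."""
    first_base, second_base = pair
    float_a = []
    # zip walks d1 and d2 in step, so each iteration sees one entry of each
    for base_freqs, pair_freqs in zip(d1, d2):
        first_freqs, second_freqs = base_freqs  # e.g. freqs of 'CGG' and of 'ATT'
        # .get(..., 0.0) treats a missing pair or base as frequency 0
        pair_freq = pair_freqs.get(pair, 0.0)
        expected = first_freqs.get(first_base, 0.0) * second_freqs.get(second_base, 0.0)
        float_a.append(pair_freq - expected)
    return float_a


float_a = calc_a(d1, d2)
print(float_a)
```

Calling calc_a(d1, d2, ('G', 'A')) instead switches the calculation to the GA pair.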
How do I calculate b, where:
b = (a^2) / ((freq of C in CGG) * (freq of G in CGG) * (freq of A in ATT) * (freq of T in ATT))
Output: b = 1
This should be repeated for every entry in the list.
Final output for sample: b = [1, 0.3333]
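A possible sketch for b, again with a for loop. The name calc_b is made up for illustration, and the code assumes every sequence contains exactly two distinct bases, so the denominator is simply the product of all four frequencies in a d1 entry. d1 is written here with exact thirds instead of the rounded 0.33 and 0.66, because the rounding otherwise pushes the first value away from 1. Note that in Python ^ is bitwise XOR, so the square is written as a ** 2.

```python
# d1 with exact frequencies (assumption: 0.33 and 0.66 stand for 1/3 and 2/3)
d1 = [[{'G': 2.0 / 3, 'C': 1.0 / 3}, {'A': 1.0 / 3, 'T': 2.0 / 3}],
      [{'G': 0.5, 'C': 0.5}, {'A': 0.75, 'T': 0.25}]]
# values of a for the CA pair, computed with the same exact frequencies
float_a = [2.0 / 9, 0.125]


def calc_b(d1, float_a):
    """Return a**2 divided by the product of the four base frequencies."""
    float_b = []
    for base_freqs, a in zip(d1, float_a):
        denominator = 1.0
        # multiply the frequencies of both sequences of this entry together
        for freqs in base_freqs:
            for freq in freqs.values():
                denominator *= freq
        float_b.append(a ** 2 / denominator)
    return float_b


float_b = calc_b(d1, float_a)
print(float_b)
```

Because a is squared, its sign does not matter here: 0.125 and -0.125 both lead to the same second value of b.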
I do not know how to extract the required values from d1 and d2 and perform mathematical operations.
I tried the following code to compute a:
float a = {k: float(d1[k][0]) - d2[k][0] * d2[k][1]for k in d1.viewkeys() & d2.viewkeys()}
But that does not work. In addition, I would prefer a for loop over a comprehension.
My attempt to write a (rather erroneous) for-loop for the above:
float_a = []
for pair,i in enumerate(d2):
    for base,j in enumerate(d1):
        float (a) = pair[i][0] - base[j][] * base[j+1][]
        float_a.append(a)
float_b = []
for floata in enumerate(float_a):
    for base,j in enumerate(d1):
        float (b) = (float(a) * float(a)) - (base[j] * base[j+1]*base[j+2]*base[j+3])
        float_b.append(b)
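For reference, a small sketch of the iteration pattern the loops above seem to be reaching for. enumerate yields (index, item) in that order, so the index has to come first in the loop header, and since d1 and d2 are parallel lists, one loop over zip(d1, d2) is enough instead of two nested loops. The variable names are only illustrative.

```python
d1 = [[{'G': 0.66, 'C': 0.33}, {'A': 0.33, 'T': 0.66}],
      [{'G': 0.5, 'C': 0.5}, {'A': 0.75, 'T': 0.25}]]
d2 = [{('C', 'A'): 0.33, ('G', 'T'): 0.66},
      {('G', 'T'): 0.25, ('C', 'A'): 0.5, ('G', 'A'): 0.25}]

# enumerate: index first, item second
for i, pair_freqs in enumerate(d2):
    first_freqs, second_freqs = d1[i]  # same position in d1
    print(i, sorted(pair_freqs), sorted(first_freqs), sorted(second_freqs))

# zip: walk both lists in step without an index
rows = []
for (first_freqs, second_freqs), pair_freqs in zip(d1, d2):
    # collect the keys available for each entry
    rows.append((sorted(first_freqs), sorted(second_freqs), sorted(pair_freqs)))
```

Inside either loop, a single frequency is then looked up by key, for example first_freqs['C'] or pair_freqs[('C', 'A')], rather than by a numeric index.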