If you are familiar with Python, I would use pandas. It uses "DataFrames" similar to R's, so you can take the concept and apply it in R.
Say data1 is a tab-delimited file that looks like this:
GeneName ExpValue
gene1 300.0
gene2 250.0
First, read each file into a DataFrame:
import pandas as pd
dfblood = pd.read_csv('path/to/data1',delimiter='\t')
dftissue = pd.read_csv('path/to/data2',delimiter='\t')
dftumor = pd.read_csv('path/to/data3',delimiter='\t')
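To try the loading step without real files, here is a minimal sketch that reads the same two-column layout from an in-memory string. The sample text and the names `sample_tsv` and `dfdemo` are made up for illustration; only `pd.read_csv` with a tab delimiter comes from the answer itself.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of one tab-delimited file.
sample_tsv = "GeneName\tExpValue\ngene1\t300.0\ngene2\t250.0\n"

# read_csv accepts any file-like object, so StringIO works in place of a path.
dfdemo = pd.read_csv(io.StringIO(sample_tsv), delimiter="\t")
print(dfdemo)
```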
Then merge them into a single DataFrame, df.
dftmp = pd.merge(dfblood,dftissue,on='GeneName',how='inner')
df = pd.merge(dftmp,dftumor,on='GeneName',how='inner')
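If there are more than three samples, chaining the merges by hand gets tedious. One possible variation, sketched here on made-up toy tables, renames the value column to the sample name before merging and folds the list with `functools.reduce`. The `frames` dictionary and its values are hypothetical.

```python
from functools import reduce
import pandas as pd

# Hypothetical toy inputs; gene3 is missing from the tumor table.
frames = {
    "blood": pd.DataFrame({"GeneName": ["gene1", "gene2", "gene3"],
                           "ExpValue": [300.0, 250.0, 10.0]}),
    "tissue": pd.DataFrame({"GeneName": ["gene1", "gene2", "gene3"],
                            "ExpValue": [120.0, 80.0, 5.0]}),
    "tumor": pd.DataFrame({"GeneName": ["gene1", "gene2"],
                           "ExpValue": [400.0, 90.0]}),
}

# Rename ExpValue to the sample name first, so merged columns need no suffixes.
renamed = [f.rename(columns={"ExpValue": name}) for name, f in frames.items()]

# Chain the inner joins; only genes present in every table survive.
merged = reduce(
    lambda left, right: pd.merge(left, right, on="GeneName", how="inner"),
    renamed,
)
print(merged)
```

Because the join is `how='inner'`, gene3 is dropped here. Whether that is what you want depends on how you prefer to handle genes missing from a sample.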
After the merge, rename the columns so that each one is named after its sample.
df.columns = ['GeneName','blood','tissue','tumor']
Set the gene name as the index and normalize each column (subtract the mean, divide by the range).
df = df.set_index('GeneName')
df_norm = (df - df.mean()) / (df.max() - df.min())
df_norm.corr() then gives the correlation matrix. pandas runs on top of numpy, so this is fast.
blood tissue tumor
blood 1.000000 0.395160 0.581629
tissue 0.395160 1.000000 0.840973
tumor 0.581629 0.840973 1.000000
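As a side note, Pearson correlation does not change when a column is shifted and rescaled by a positive constant, so corr() on the raw and on the normalized values should agree. A small check on made-up numbers (the `toy` DataFrame is hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical expression values, indexed by gene name.
toy = pd.DataFrame(
    {"blood": [300.0, 250.0, 10.0, 40.0],
     "tissue": [120.0, 80.0, 5.0, 60.0],
     "tumor": [400.0, 90.0, 20.0, 70.0]},
    index=["gene1", "gene2", "gene3", "gene4"],
)

# Same column-wise normalization: centre on the mean, scale by the range.
toy_norm = (toy - toy.mean()) / (toy.max() - toy.min())

# Compare the two correlation matrices up to floating-point tolerance.
same = np.allclose(toy.corr(), toy_norm.corr())
print(same)
```

The normalization still matters for anything that depends on scale, such as plotting the samples on a common axis.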
HTH.
For a Student's t-test, add 1 to the values and log-transform them with numpy.log:
import numpy as np
df[['blood','tissue','tumor']] = df[['blood','tissue','tumor']]+1
df_log = np.log(df[['blood','tissue','tumor']])
Then add the log fold changes as new columns of the df_log DataFrame.
df_log['logFCBloodTumor'] = df_log['blood'] - df_log['tumor']
df_log['logFCBloodTissue'] = df_log['blood'] - df_log['tissue']
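To see what these columns contain, here is a small worked sketch on made-up values. It uses `np.log2` rather than `np.log`, which is an assumption on my part (base 2 makes the numbers easy to read: a value of 1 means a two-fold difference); the `toy` data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical values chosen so that value + 1 is a power of two.
toy = pd.DataFrame(
    {"blood": [3.0, 7.0], "tumor": [1.0, 1.0]},
    index=["gene1", "gene2"],
)

# Pseudocount of 1 avoids log(0); log2 is an assumption (the answer uses np.log).
toy_log = np.log2(toy + 1)

# A difference of logs is the log of the ratio, i.e. the log fold change.
toy_log["logFCBloodTumor"] = toy_log["blood"] - toy_log["tumor"]
print(toy_log)
```

Here gene1 comes out at 1 (two-fold higher in blood after the pseudocount) and gene2 at 2 (four-fold higher).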