I have a project in which I run several pieces of data through a specific function that "cleans" them.
The cleanup function, in Misc.py, looks like this:

    import sys

    def clean(my_data):
        sys.stdout.write("Cleaning genes...\n")
        synonyms = FileIO("raw_data/input_data", 3, header=False).openSynonyms()
        clean_data = {}
        # Iterate over a copy of the keys, since entries are deleted inside the loop.
        for g in list(my_data):
            if g in synonyms:
                for synonym in synonyms[g]:
                    if synonym in my_data:
                        del my_data[synonym]
                        clean_data[g] = synonym
                        sys.stdout.write("\t%s is also known as %s\n" % (g, clean_data[g]))
        return my_data

FileIO is a special class that I created to open files.
My problem is that this function will be called many times during the program's life cycle. I want to avoid reading input_data on every call, since it is the same every time. I know that I can simply return it and pass it back in as an argument, as follows:

    def clean(my_data, synonyms=None):
        if synonyms is None:
            ...
        else:
            ...

But is there another, better looking way to do this?
My file structure is as follows:

    lib/
        Misc.py
        FileIO.py
        __init__.py
        ...
    raw_data/
    runme.py

From runme.py, I run "from lib import *" and call all the functions that I wrote.
Is there a Pythonic way around this? Something like a “memory” for a function?
synonyms = FileIO("raw_data/input_data", 3, header=False).openSynonyms() returns a collections.OrderedDict() built from input_data.
The input file looks like this:

    column1  column2  key  data
    ...      ...      A    B|E|Z
    ...      ...      B    F|W
    ...      ...      C    G|P
    ...

which produces:
OrderedDict([('A',['B','E','Z']), ('B',['F','W']), ('C',['G','P'])])
So in the script, A is also known as B, E, Z; B is also known as F, W; and so on.
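For illustration, here is a rough sketch of how rows in that layout map to the OrderedDict. This is not the real FileIO code: parse_synonyms and its key_column parameter are hypothetical, I am assuming whitespace-separated columns, and the "x" and "y" values are placeholders for the two columns elided above.

```python
from collections import OrderedDict


def parse_synonyms(lines, key_column=2):
    """Hypothetical parser: map each key to its '|'-separated synonyms."""
    synonyms = OrderedDict()
    for line in lines:
        fields = line.split()  # assumption: columns are separated by whitespace
        if len(fields) <= key_column + 1:
            continue  # skip blank lines and rows without a data column
        synonyms[fields[key_column]] = fields[key_column + 1].split("|")
    return synonyms


# Placeholder rows: "x" and "y" stand in for the elided column1 and column2.
rows = ["x y A B|E|Z", "x y B F|W", "x y C G|P"]
print(parse_synonyms(rows))
```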