Saving data between calls to a Python function

I have a project in which I run several data sets through a specific function that "cleans" them.

The cleanup function, in Misc.py, looks like this:

import sys

def clean(my_data):
    sys.stdout.write("Cleaning genes...\n")

    synonyms = FileIO("raw_data/input_data", 3, header=False).openSynonyms()
    clean_data = {}

    # Iterate over a copy of the keys, because entries are deleted in the loop.
    for g in list(my_data):
        if g in synonyms:
            # Found a data point which appears in the synonym list.
            #print synonyms[g]
            for synonym in synonyms[g]:
                if synonym in my_data:
                    del my_data[synonym]
                    clean_data[g] = synonym
                    sys.stdout.write("\t%s is also known as %s\n" % (g, clean_data[g]))
    return my_data

FileIO is a special class that I created to open files.

My problem is that this function will be called many times during the program's life cycle. What I want is to avoid reading input_data on every call, since it will be the same every time. I know that I can simply return it and pass it back in as an argument, as follows:

def clean(my_data, synonyms=None):
    if synonyms is None:
       ...
    else:
       ...

But is there another, better looking way to do this?

My file structure is as follows:

lib
    Misc.py
    FileIO.py
    __init__.py
    ...
raw_data
runme.py

In runme.py, I do from lib import * and call all the functions that I wrote.

Is there a pythonic way around this? Something like a "memory" for a function?

Update: the line synonyms = FileIO("raw_data/input_data", 3, header=False).openSynonyms() returns a collections.OrderedDict() built from input_data.

The input file looks like this:

column1    column2    key    data
  ...        ...      A      B|E|Z
  ...        ...      B      F|W
  ...        ...      C      G|P
  ...

The returned dictionary looks like this:

OrderedDict([('A',['B','E','Z']), ('B',['F','W']), ('C',['G','P'])])

This tells the script that A is also known as B, E and Z, that B is also known as F and W, and so on.
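For illustration, the mapping from the file layout to the dictionary can be sketched as follows. This is not the actual FileIO class, which is not shown in the question; parse_synonyms is a hypothetical helper, and it assumes whitespace-separated columns with the key in the third column and the "|"-separated synonyms in the fourth.

```python
from collections import OrderedDict


def parse_synonyms(lines):
    """Build key -> list of synonyms from rows like 'x  y  A  B|E|Z'.

    Hypothetical stand-in for FileIO(...).openSynonyms(); the column
    positions are an assumption based on the sample shown above.
    """
    synonyms = OrderedDict()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or incomplete rows
        key, data = fields[2], fields[3]
        synonyms[key] = data.split("|")
    return synonyms


sample = [
    "x  y  A  B|E|Z",
    "x  y  B  F|W",
    "x  y  C  G|P",
]
parsed = parse_synonyms(sample)
```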

+4
3

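Use a class with a __call__ method. Objects of such a class can be called like functions, and any data you want to keep between calls can be stored on the object, for example by the constructor. What you get this way is known as a "functor" or "callable object".

Example: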

class Incrementer:
    def __init__ (self, increment):
        self.increment = increment

    def __call__ (self, number):
        return self.increment + number

incrementerBy1 = Incrementer (1)

incrementerBy2 = Incrementer (2)

print (incrementerBy1 (3))
print (incrementerBy2 (3))

Output:

4
5
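Applied to the question, the same idea might look roughly like the sketch below. The class name Clean and the load_synonyms argument are assumptions made for illustration; fake_loader stands in for the FileIO call from the question so that the example runs on its own, and it counts how often the data is loaded.

```python
class Clean:
    """Callable object that loads the synonyms once and reuses them."""

    def __init__(self, load_synonyms):
        # load_synonyms is a hypothetical zero-argument loader standing in
        # for FileIO("raw_data/input_data", 3, header=False).openSynonyms
        self.synonyms = load_synonyms()

    def __call__(self, my_data):
        # Iterate over a copy of the keys, because entries are deleted.
        for g in list(my_data):
            for synonym in self.synonyms.get(g, []):
                if synonym in my_data:
                    del my_data[synonym]
        return my_data


load_count = 0


def fake_loader():
    """Pretend to read the synonym file and count how often that happens."""
    global load_count
    load_count += 1
    return {"A": ["B", "E", "Z"], "B": ["F", "W"], "C": ["G", "P"]}


clean = Clean(fake_loader)            # the "file" is read here, once
first = clean({"A": 1, "B": 2, "C": 3})
second = clean({"B": 1, "F": 2})      # no second read
```

After both calls, load_count is still 1, which is the "memory" the question asks for.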

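[EDIT]

See also the answer by @Tagc, which takes the same approach with an ordinary method instead of a callable object.

If the class were named Clean rather than DataCleaner and defined __call__ instead of a clean method, an instance of Clean could be called just like the original clean function.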

+4

"Like a 'memory' for a function"

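Functions are not meant to hold state; objects are.

I would encapsulate this in a class, e.g. DataCleaner. The synonyms are loaded once, when the object is created, and kept on the instance. You then call clean on that object, like this: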

class FileIO(object):
    def __init__(self, file_path, some_num, header):
        pass

    def openSynonyms(self):
        return []

class DataCleaner(object):
    def __init__(self, synonym_file):
        self.synonyms = FileIO(synonym_file, 3, header=False).openSynonyms()

    def clean(self, data):
        for g in data:
            if g in self.synonyms:
                # ...
                pass

if __name__ == '__main__':
    dataCleaner = DataCleaner('raw_data/input_file')
    dataCleaner.clean('some data here')
    dataCleaner.clean('some more data here')

If needed, you can use a factory to create the DataCleaner objects (for example, one per input file).
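One possible shape for such a factory is sketched below. The names get_cleaner and _cleaners, and the loader argument, are assumptions for illustration; fake_loader replaces the FileIO call so the sketch is self-contained.

```python
class DataCleaner(object):
    def __init__(self, synonym_file, loader):
        # loader is a hypothetical stand-in for
        # FileIO(synonym_file, 3, header=False).openSynonyms()
        self.synonyms = loader(synonym_file)

    def clean(self, data):
        """Return the items of data that have an entry in the synonyms."""
        return [g for g in data if g in self.synonyms]


_cleaners = {}  # one DataCleaner per synonym file


def get_cleaner(synonym_file, loader):
    """Return the DataCleaner for synonym_file, creating it on first use."""
    if synonym_file not in _cleaners:
        _cleaners[synonym_file] = DataCleaner(synonym_file, loader)
    return _cleaners[synonym_file]


loaded_files = []


def fake_loader(path):
    """Record which files were 'read' and return a small synonym table."""
    loaded_files.append(path)
    return {"A": ["B", "E", "Z"], "B": ["F", "W"]}


cleaner_1 = get_cleaner("raw_data/input_data", fake_loader)
cleaner_2 = get_cleaner("raw_data/input_data", fake_loader)  # reused
kept = cleaner_2.clean(["A", "X", "B"])
```

Both calls return the same object, so each file is read at most once no matter how many parts of the program ask for a cleaner.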

+3

I think the cleanest way to do this is to decorate your "clean" function (pun intended) with another function that provides the synonyms as a local for that function. This is IMO cleaner and more concise than creating another custom class, but it still lets you easily change the "input_data" file if you need to (factory function):

def defineSynonyms(datafile):
    def wrap(func):
        # Read the file once, when the decorator is applied, not on every call.
        synonyms = FileIO(datafile, 3, header=False).openSynonyms()
        def wrapped(*args, **kwargs):
            kwargs['synonyms'] = synonyms
            return func(*args, **kwargs)
        return wrapped
    return wrap

@defineSynonyms("raw_data/input_data")
def clean(my_data, synonyms=None):
    # do stuff with synonyms and my_data...
    pass
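
To check how often the file is really read with a decorator like this, a self-contained variant can use a counting loader. In this sketch the synonyms are loaded once, when the decorator is applied; fake_open_synonyms is a hypothetical stand-in for the FileIO call, and define_synonyms is just an illustrative name.

```python
import functools

load_count = 0


def fake_open_synonyms(datafile):
    """Hypothetical stand-in for FileIO(datafile, 3, header=False).openSynonyms()."""
    global load_count
    load_count += 1
    return {"A": ["B", "E", "Z"], "B": ["F", "W"]}


def define_synonyms(datafile):
    def wrap(func):
        # Runs once, at decoration time, not on every call.
        synonyms = fake_open_synonyms(datafile)

        @functools.wraps(func)  # keep the name and docstring of func
        def wrapped(*args, **kwargs):
            # Callers may still pass their own synonyms explicitly.
            kwargs.setdefault("synonyms", synonyms)
            return func(*args, **kwargs)
        return wrapped
    return wrap


@define_synonyms("raw_data/input_data")
def clean(my_data, synonyms=None):
    """Return the items of my_data that have an entry in synonyms."""
    return [g for g in my_data if g in synonyms]


result_1 = clean(["A", "C"])
result_2 = clean(["B"])
```

One trade-off of this design is that the file is read as soon as the module is imported, even if clean is never called.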
+1

Source: https://habr.com/ru/post/1667674/

