My program, written in Python 3, has many places where it starts with a (very large) tabular structure of numeric data and adds columns to it according to a specific algorithm. (The algorithm is different in all places.)
I am trying to transform this into a pure functional approach, as I am having problems with the imperative approach (it’s hard to reuse it, it’s hard to remember intermediate steps, it’s hard to achieve “lazy” calculations that are prone to errors due to state dependency, etc.).
The Table class is implemented as a dictionary of dictionaries: an external dictionary contains rows indexed by row_id ; the inside contains the values inside the row indexed by column_title . The table methods are very simple:
Until now, I just added columns to the source table, and each function took the whole table as an argument. When I move on to pure functions, I will have to make all the arguments unchanged. So, the initial table becomes unchanged. Any additional columns will be created as stand-alone columns and passed only to the functions that need them. A typical function will take an initial table and several columns that are already created, and return a new column.
The problem I am facing is how to implement a separate column ( Column )?
I could make each of them a dictionary, but it seems very expensive. In fact, if I ever need to perform an operation, say, 10 fields in each logical row, I will need to perform 10 searches in the dictionary. And, in addition, each column will contain both a key and a value, doubling its size.
I could make Column simple list and store in it a reference to the mapping from row_id to the index of the array. The advantage is that this collation can be split between all columns that correspond to the same initial table, and also once when it is scanned once, it works for all columns. But does this create any other problems?
If I do this, can I go ahead and actually save the display inside the source table itself? Can I put links from Column objects back to the source table from which they were created? It seems very different from how I imagined a functional approach to work, but I don’t see what problems it can cause, since everything is immutable.
Does the functional approach generally frown while keeping the link in the return value to one of the arguments? It doesn't seem to break anything (e.g. optimization or lazy evaluation), since the argument was already known anyway. But maybe I missed something.