Python memory table

What is the correct way to create an in-memory table in Python that supports direct lookup by row and column?
I was thinking about using a dict of dicts, like this:

    class Table(dict):
        def __getitem__(self, key):
            if key not in self:
                self[key] = {}
            return dict.__getitem__(self, key)

    table = Table()
    table['row1']['column1'] = 'value11'
    table['row1']['column2'] = 'value12'
    table['row2']['column1'] = 'value21'
    table['row2']['column2'] = 'value22'

    >>> table
    {'row1': {'column1': 'value11', 'column2': 'value12'}, 'row2': {'column1': 'value21', 'column2': 'value22'}}

It was hard for me to look up values in the columns.

    >>> 'row1' in table
    True
    >>> 'value11' in table['row1'].values()
    True

Now how do I check whether 'column1' contains 'value11'?
Is this method of generating tables wrong?
Is there a better way to implement such tables with an easier search?

+4
4 answers

Now how do I check whether 'column1' contains 'value11'?

    # iterate over the row dicts themselves; iteritems() would yield (key, row) pairs
    any(arow['column1'] == 'value11' for arow in table.values())
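The same check can be wrapped in a small helper. This is only a minimal sketch for Python 3: the name `column_has_value` is hypothetical, and it assumes that rows lacking the column should be skipped rather than raise `KeyError`.

```python
def column_has_value(table, column, value):
    """Return True if any row of a dict-of-dicts table holds `value` in `column`."""
    return any(
        column in row and row[column] == value
        for row in table.values()
    )


table = {
    'row1': {'column1': 'value11', 'column2': 'value12'},
    'row2': {'column1': 'value21', 'column2': 'value22'},
}

print(column_has_value(table, 'column1', 'value11'))  # True
print(column_has_value(table, 'column2', 'value11'))  # False
```

This is still a linear scan over all rows, which is usually fine for small tables.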

Is this method of generating tables wrong?

No, it's just very "exposed", perhaps too much so. It could be encapsulated in a class that provides the methods you need; then the question of how best to implement them does not affect the rest of your application.

Is there a better way to implement such tables with an easier search?

Once you have designed a class whose interface you would like to use, you can experiment with very different implementation approaches and benchmark them on a workload that represents your usage pattern, so that you can find out what works best for you (provided that table processing and searching are a large part of your application's runtime, of course; profile your application to find out).

I had similar, but not identical, needs for a large internal application that I maintain at work, except that the row indices are integers (only the column names are strings), the order of the columns matters, and the workload is more about editing the table (adding, deleting, and reordering rows or columns, renaming columns, etc.). I started with a table class exposing the functionality I needed, with a simple rough-and-ready implementation inside (a list of dicts, plus a list of column names to keep the column order); by now I have evolved it (independently of the actual application-level parts, but based on profiling and benchmarking) to completely different implementations (currently based on numpy).
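For illustration only, here is a rough sketch of what a list-of-dicts table with a separate column-order list might look like. The class and method names (`OrderedTable`, `append_row`, `rename_column`, `move_column`, `as_lists`) are hypothetical and are not the code of the application described above.

```python
class OrderedTable:
    """Rows are dicts keyed by column name; `columns` keeps the column order."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = []  # position in this list is the integer row index

    def append_row(self, values):
        # Pair the values with the column names in their current order.
        self.rows.append(dict(zip(self.columns, values)))

    def rename_column(self, old, new):
        self.columns[self.columns.index(old)] = new
        for row in self.rows:
            row[new] = row.pop(old)

    def move_column(self, name, position):
        # Only the order list changes; the row dicts are untouched.
        self.columns.remove(name)
        self.columns.insert(position, name)

    def as_lists(self):
        # Render every row in the current column order.
        return [[row[c] for c in self.columns] for row in self.rows]


t = OrderedTable(['id', 'name'])
t.append_row([1, 'spam'])
t.append_row([2, 'eggs'])
t.rename_column('name', 'label')
t.move_column('label', 0)
print(t.columns)     # ['label', 'id']
print(t.as_lists())  # [['spam', 1], ['eggs', 2]]
```

Because callers only go through the methods, the storage could later be swapped for something else without touching them.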

I think you should proceed along similar lines: "dress" your current implementation in a nice interface with all the methods you need, then profile the application. If this table object is not a performance bottleneck, you are done; if it is a bottleneck, you can optimize the implementation (experiment, measure, repeat ;-) without disturbing the rest of your application.
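The "experiment, measure, repeat" loop could look roughly like the sketch below, which times two interchangeable implementations with the standard `timeit` module. Both classes are hypothetical examples: `ScanTable` scans every row, while `IndexedTable` keeps an extra count per (column, value) pair and therefore assumes cell values are hashable. The workload (2000 rows, searching for an absent value) is likewise just an assumption for the example.

```python
import timeit
from collections import defaultdict

_MISSING = object()  # sentinel meaning "cell was not set before"


class ScanTable:
    """Dict of dicts; a column search scans every row."""

    def __init__(self):
        self.d = defaultdict(dict)

    def add(self, row, col, val):
        self.d[row][col] = val

    def incol(self, col, val):
        return any(col in r and r[col] == val for r in self.d.values())


class IndexedTable(ScanTable):
    """Same interface, plus a (col, val) -> row count index for fast search."""

    def __init__(self):
        super().__init__()
        self.counts = defaultdict(int)

    def add(self, row, col, val):
        old = self.d[row].get(col, _MISSING)
        if old is not _MISSING:
            self.counts[(col, old)] -= 1  # the cell is being overwritten
        super().add(row, col, val)
        self.counts[(col, val)] += 1

    def incol(self, col, val):
        return self.counts.get((col, val), 0) > 0


def build(cls, n_rows=2000):
    t = cls()
    for i in range(n_rows):
        t.add(i, 'column1', 'value%d' % i)
    return t


for cls in (ScanTable, IndexedTable):
    t = build(cls)
    # Searching for an absent value is the worst case for the scan.
    seconds = timeit.timeit(lambda: t.incol('column1', 'absent'), number=200)
    print('%-13s %.4f s' % (cls.__name__, seconds))
```

The numbers only mean something for a workload that resembles the real application, so the rows and queries should be replaced with representative ones.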

Inheriting from dict is not a good idea, because you probably don't want to expose all of dict's rich functionality; besides, what you are doing is, roughly speaking, an inefficient implementation of collections.defaultdict(dict). So, encapsulate the latter:

    import collections

    class Table(object):
        def __init__(self):
            # defaultdict(dict) creates an empty row dict on first access
            self.d = collections.defaultdict(dict)

        def add(self, row, col, val):
            self.d[row][col] = val

        def get(self, row, col, default=None):
            return self.d[row].get(col, default)

        def inrow(self, row, col):
            return col in self.d[row]

        def incol(self, col, val):
            # iterate over the row dicts (not (key, row) pairs); skip rows lacking col
            return any(col in x and x[col] == val for x in self.d.values())

And so on: write all the methods your application needs, with useful short names, and then perhaps see whether some of them can be aliased as special methods if they are often used that way. For example (assuming Python 2.*; Python 3.* requires slightly different syntax):

        def __setitem__(self, (row, col), val):
            self.add(row, col, val)
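Python 3 removed tuple parameters in function signatures, so there the key has to be unpacked inside the method. A minimal sketch of the Python 3 form, reusing the `Table` and `add` names from above (only `add` is repeated here to keep the example short):

```python
import collections


class Table:
    def __init__(self):
        self.d = collections.defaultdict(dict)

    def add(self, row, col, val):
        self.d[row][col] = val

    def __setitem__(self, key, val):
        # table[row, col] = val passes the pair as a single tuple key
        row, col = key
        self.add(row, col, val)


table = Table()
table['row1', 'column1'] = 'value11'
print(table.d['row1']['column1'])  # value11
```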

And so on. Once the code is working, the time comes for profiling, benchmarking and, possibly, internal optimization of the implementation.

+7

I would use an in-memory database with SQLite. The sqlite3 module has been included in the standard library since Python 2.5, which means it does not even add to your requirements.
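A minimal sketch of that approach with the standard `sqlite3` module follows. The one-row-per-cell schema, the `cells` table and its column names, and the `has_value` helper are assumptions made for this example, not something prescribed by the answer.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # the database lives only in RAM
conn.execute(
    'CREATE TABLE cells ('
    ' row_name TEXT, col_name TEXT, val TEXT,'
    ' PRIMARY KEY (row_name, col_name))'
)
conn.executemany(
    'INSERT INTO cells VALUES (?, ?, ?)',
    [
        ('row1', 'column1', 'value11'),
        ('row1', 'column2', 'value12'),
        ('row2', 'column1', 'value21'),
        ('row2', 'column2', 'value22'),
    ],
)


def has_value(conn, col, val):
    """Return True if any row stores `val` in column `col`."""
    cur = conn.execute(
        'SELECT 1 FROM cells WHERE col_name = ? AND val = ? LIMIT 1',
        (col, val),
    )
    return cur.fetchone() is not None


print(has_value(conn, 'column1', 'value11'))  # True
print(has_value(conn, 'column2', 'value11'))  # False
```

If column searches dominate, an index on `(col_name, val)` could be added with `CREATE INDEX`.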

+7

A nested list should be able to do the job here. I would use nested dictionaries only if the elements are sparsely distributed over the grid.

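    height, width = 3, 4  # example dimensions (placeholders)
    value = None          # initial cell value (placeholder)

    grid = []
    for row in range(height):
        grid.append([])
        for cell in range(width):
            grid[-1].append(value)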

Checking the rows is simple:

    def valueInRow(value, row):
        return value in grid[row]

Checking the columns requires a bit more work because the grid is a list of rows, not a list of columns:

    def columnIterator(column):
        height = len(grid)
        for row in range(height):
            yield grid[row][column]

    def valueInColumn(value, column):
        return value in columnIterator(column)
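As an alternative sketch, the column check can also be written with `any()`, or the grid can be transposed once with `zip(*grid)` when many columns are going to be searched. The sample grid and the name `value_in_column` are made up for the example, and the grid is assumed to be rectangular (`zip` stops at the shortest row).

```python
grid = [
    ['a', 'b', 'c'],
    ['d', 'e', 'f'],
]


def value_in_column(grid, value, column):
    # Look at the same position in every row; stops at the first match.
    return any(row[column] == value for row in grid)


# Transposing turns the list of rows into a list of column tuples.
columns = list(zip(*grid))

print(value_in_column(grid, 'e', 1))  # True
print('e' in columns[1])              # True
print('e' in columns[0])              # False
```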
0

Now how do I check whether 'column1' contains 'value11'?

Are you asking about this?

    found = False
    for r in table:
        if table[r]['column1'] == 'value11':
            found = True
            break

Is this what you are trying to do?

0

Source: https://habr.com/ru/post/1305827/

