There are literally hundreds of methods for this, depending on your needs. The rare Wikipedia matrix is a good start to understand a method that is specific to your needs.
As an extremely simple example, you can use the Dictionary of Keys class as follows:
class SparseDOK(dict): def __init__(self): pass def __setitem__(self,key,value): if value in[0,0.0,False,None]: dict.__setitem__(self,key,False) dict.__delitem__(self,key) else: dict.__setitem__(self,key,True) def __getitem__(self, key): try: return dict.__getitem__(self, key) except KeyError: return False >>> dok=SparseDOK() >>> dok[10,20]=55 >>> print dok {(10, 20): True} >>> print dok[10,20] True >>> print dok[55,300] False >>> dok[10,20]=False >>> print dok[10,20] False
Each entry in an arbitrary βmatrixβ is considered False, unless True is set. You will need to add error checking, but it will be very compact and fast.
The advantage of constructing a key dictionary is a very efficient construction of the data structure. You only need to look at the source data once, and you can easily add or delete data. The disadvantage is less interactive processing of the matrix after its creation.
Since dictionary keys are tuples, it is trivial to add indexes by row or column. Since to complete this work, we need the entire matrix to build, we can simply build a dict with any sum or desired product once, and then turn to this type of processed data.
>>> dok[10,20]=True >>> dok[10,2000]=True >>> dok[11,2000]=True >>> dok[35000,2000]=True >>> dok[10,35000]=True >>> print dok {(11, 2000): True, (10, 2000): True, (35000, 2000): True, (10, 20): True, (10, 35000): True} cols={} for tup in dok.keys(): if tup[1] not in cols: cols[tup[1]]=1 else: cols[tup[1]]+=1 >>> print cols {2000: 3, 35000: 1, 20: 1}
Now you can refer to the col key in cols for the sum of the rows by col. It is trivial to add a product, etc. Just remember that you need to recalculate the amounts / products if the original DOK is edited or modified. You can save the total amount if you expect the DOK to change often after it is created.
If your needs are more complex, consider using SciPy or Pysparse . As you can see, SciPy has 7 different sparse matrix formats. Do not invent something that others have already done better ...