I have the following data structure (with sample data):
edgeID (unique key) | timeStep (ordering key,        | value
                    | can have multiple occurrences) |
-----------------------------------------------------------------
"edge1"             | 15                             | 12.1
"edge3"             | 18                             | 17.32
"edge2"             | 23                             | 15.1
"edge5"             | 23                             | 65.6
I want to be able to perform the following tasks efficiently on this structure:
- Add a new data entry with a timeStep higher than any other stored timeStep. If maxNumber data records (e.g. 20) are already stored, the data record with the smallest timeStep should be deleted.
- Merge two data sets, keeping the maxNumber (e.g. 20) records with the highest timeStep, while retaining each edgeID at most once (if there are two records for the same edge, the one with the higher timeStep should be kept).
How can I implement this data structure in Python?
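To make the expected behaviour of the two operations concrete, here is a naive reference sketch on a plain dict. It is not meant to be efficient (eviction scans all records, merging sorts everything), and the names add_record, merge and max_number are hypothetical, chosen only for this illustration:

```python
def add_record(records, edge_id, time_step, value, max_number=20):
    """Insert or overwrite one record, then evict the smallest timeStep if needed.

    `records` is a plain dict: edgeID -> (timeStep, value).
    """
    records[edge_id] = (time_step, value)
    if len(records) > max_number:
        # O(n) scan for the smallest timeStep; acceptable for a reference only
        oldest = min(records, key=lambda k: records[k][0])
        del records[oldest]


def merge(a, b, max_number=20):
    """Combine two record dicts, keeping the highest timeStep per edgeID."""
    merged = dict(a)
    for edge_id, (time_step, value) in b.items():
        if edge_id not in merged or time_step > merged[edge_id][0]:
            merged[edge_id] = (time_step, value)
    # keep only the max_number records with the highest timeStep
    newest = sorted(merged.items(), key=lambda kv: kv[1][0], reverse=True)
    return dict(newest[:max_number])
```

With the sample data above and max_number=4, adding a record with timeStep 30 would evict "edge1", since it has the smallest timeStep (15).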
I tried one approach that works: a dict that holds the data, plus a SortedSet that keeps the keys ordered by timeStep:
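from sortedcontainers import SortedSet

data = {}  # edgeID -> (timeStep, value)
# edgeIDs, ordered by the timeStep of their record in data
dataOrder = SortedSet(key=lambda x: data[x][0])
maxDataSize = 20

def addData(edgeID, dataTuple):
    if edgeID in data:
        # drop the stale ordering entry before its sort key changes
        dataOrder.discard(edgeID)
    elif len(data) >= maxDataSize:
        # remove the record with the smallest timeStep
        key = dataOrder.pop(0)
        del data[key]
    # add
    data[edgeID] = dataTuple
    dataOrder.add(edgeID)

addData("edge1", (15, 12.1))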
The drawback is that every edgeID is stored twice: once as a dict key and once in the SortedSet.
A second approach uses only a SortedSet that stores the full tuples, ordered by timeStep:
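from sortedcontainers import SortedSet

# (edgeID, timeStep, value) tuples, ordered by timeStep
data = SortedSet(key=lambda x: x[1])
maxDataSize = 20

def addData(dataTuple):
    if len(data) >= maxDataSize:
        # remove the record with the smallest timeStep
        data.pop(0)
    data.add(dataTuple)

addData(("edge1", 15, 12.1))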
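With this version, however, the same edgeID can be stored several times with different timeSteps, because the set treats the whole tuple (not just the edgeID) as the element. What I would need is something like an OrderedSet whose uniqueness is based on a key. One option would be to wrap each record in a class that overrides __hash__() so that it depends only on the edgeID. Is there an OrderedSet that supports this? Or is there a better approach?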
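As a rough illustration of the __hash__() idea, the sketch below uses a hypothetical Record class whose identity depends only on the edgeID. It is shown with the built-in set to stay self-contained; whether it behaves as desired inside a SortedSet or OrderedSet is an assumption that is not tested here. Note that __eq__() has to be overridden as well, since a set compares elements for equality after hashing them:

```python
class Record:
    """Hypothetical wrapper: two records are 'the same' if their edgeID matches."""

    def __init__(self, edge_id, time_step, value):
        self.edge_id = edge_id
        self.time_step = time_step
        self.value = value

    def __hash__(self):
        return hash(self.edge_id)

    def __eq__(self, other):
        return isinstance(other, Record) and self.edge_id == other.edge_id

    def __repr__(self):
        return f"Record({self.edge_id!r}, {self.time_step}, {self.value})"


# Plain tuples: both "edge1" entries survive, because the whole tuple is the element.
as_tuples = {("edge1", 15, 12.1), ("edge1", 20, 3.0)}

# Wrapped records: the second "edge1" counts as a duplicate of the first.
as_records = {Record("edge1", 15, 12.1)}
as_records.add(Record("edge1", 20, 3.0))  # no effect: the existing element stays
```

One caveat of this design: adding an equal element does not replace the one already in the set, so the older record would have to be discarded explicitly before the newer one is added.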