What is an easy way to create a memory-limited memoization system?

I am writing a manual memoization computation system (pah, in Matlab). The simple parts are simple:

  • A way to put data into the memoization system after doing the calculation.
  • A way to request and receive data from memoization.
  • The system request method for all the "keys".

These parts are not so much in doubt. The problem is that my computer has a limited amount of memory, so sometimes the put operation should unload some objects from memory. I'm worried about "cache misses," so I need some relatively simple system to remove memoized objects that are not used often and / or are not expensive to re-compromise. How do I design this system? The parts that I can imagine have:

  • A way of saying the put operation is an expensive (relatively speaking) calculation.
  • The method does not necessarily hint at the "put" operation, as often calculation may be required (hereinafter).
  • The get operation would like to note how often a given object is set.
  • A way to tell the whole system the maximum amount of memory used (normal, it's simple).

The real guts of this will be in the "put" operation, when you press the memory limit, and he will have to select some objects based on their memory, their cost and their usefulness. How to do it?

Sorry if this is too vague or off topic.

+4
source share
2 answers

I would do this by creating a subclass of DYNAMICPROPS that uses an array of cells to store data inside. Thus, you can dynamically add additional data to the object.

Here is the basic design idea:

Data is stored in an array of cells. Each property gets its own row, with the first column being the name of the property (for convenience), the second column is the function descriptor for calculating the data, the third column is the data, the fourth column is the time it takes to generate the data, and the fifth column is an array, say, of length 100, which stores timestamps corresponding to when the resource was accessed the last 100 times, and the sixth column contains a variable size.

There is a general get method that takes as input the string corresponding to the property (see below). The get method first checks to see if column 3 is empty. If not, it returns a value and saves a timestamp. If so, it does the calculation using the descriptor from column 1 inside the TIC / TOC expression to measure how expensive the calculation is (which is stored in col4 and the timestamp is stored in col5). He then checks to see if there is enough space to store the result. If so, then it stores the data, otherwise it checks the size, as well as the product, of how many times the data was available, how long it will take to recover to decide what to select.

In addition, there is a 'add' property that adds a string to an array of cells, creates a dynamic property (using addprops ) with the same name as the function handle, and sets the get-method to myGetMethod(myPropertyIndex) . If you need to pass parameters to a function, you can create an additional property, myDynamicPropertyName_parameters , using the set method, which will delete previously calculated data whenever the parameter values โ€‹โ€‹change.

Finally, you can add several dependent properties that can determine how many properties exist (the number of rows in the array of cells), how they are called (the first column of the array of cells), etc.

+2
source

Consider using Java because MATLAB runs on top of it and can access it. This will work if you have marching values โ€‹โ€‹(ints, doubles, string, matrices, not structs or cell arrays).

LRU containers are available for Java:

+1
source

Source: https://habr.com/ru/post/1340104/


All Articles