I am writing some software that flattens data from a hierarchical format into a table format. Instead of doing all of this in a programming language on every request and serving the result, I want to cache the results for a few seconds and use SQL to sort and filter them. In use, we are talking about roughly 400,000 records and 1 or 2 reads during those few seconds.
Each table will have from 3 to 15 columns. Each row will hold from 100 bytes to 2,000 bytes of data, although in some cases some rows may reach up to 15,000 bytes. I can truncate the data if necessary to keep things workable.
The main options I am considering:
MySQL MEMORY engine
A good option, almost purpose-built for my use case! But... "MEMORY tables use a fixed-length row-storage format. Variable-length types such as VARCHAR are stored with a fixed length. MEMORY tables cannot contain BLOB or TEXT columns." Unfortunately, I have text fields up to 10,000 characters long, and even that is not a hard limit. I could size the VARCHAR columns to the maximum length of the text columns during my flattening pass (roughly the sizing step sketched below), but that is not exactly elegant. Also, for my occasional 15,000-character row, does this mean I have to allocate 15,000 characters for every row in the table? With 100,000 rows, that is roughly 1.4 GB, not including overhead!
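Concretely, the sizing pass I have in mind would be something like this sketch (the table and column names are hypothetical):

    -- Measure the longest value produced by the flattening pass
    -- (`flattened_rows` / `body_text` are placeholder names).
    SELECT MAX(CHAR_LENGTH(body_text)) AS max_len FROM flattened_rows;

    -- Then create the cache table with the measured width, e.g. 9400:
    CREATE TABLE cache_rows (
        id         INT NOT NULL,
        status     VARCHAR(16) NOT NULL,
        created_at DATETIME NOT NULL,
        body_text  VARCHAR(9400) NOT NULL,
        PRIMARY KEY (id)
    ) ENGINE=MEMORY;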
InnoDB on a RAM disk
This will run in the cloud, and I could easily deploy a server with 16 GB of RAM, point MySQL's data directory at tmpfs, and use full-featured MySQL. My concern here is space. While I'm sure the engineers who wrote the MEMORY engine built in safeguards against consuming all available memory and crashing the server, I doubt this tmpfs setup will know when to stop. (I can cap the tmpfs mount itself with its size= option, but I would rather see the problem coming before the filesystem fills.) How much actual space will my 2,000 bytes per row of data consume in InnoDB's storage format? How can I monitor and control this?
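For monitoring, I know information_schema reports allocated sizes per table, so something like this sketch (the schema name is a placeholder) should show what each cached table actually occupies:

    SELECT table_name,
           data_length  / 1024 / 1024 AS data_mb,
           index_length / 1024 / 1024 AS index_mb
    FROM information_schema.TABLES
    WHERE table_schema = 'cache_db';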
Bonus questions
Indexes: I know in advance which columns will need to be filtered and sorted on. I can set up the indexes before inserting the data, but what kind of performance gain can I honestly expect on a memory-backed table compared to disk? How much extra overhead do the indexes add?
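To make the index question concrete: MEMORY supports both HASH (its default) and BTREE indexes, so the setup I have in mind is roughly this sketch, applied before the bulk load (column names continue the hypothetical table above):

    -- Hash for equality filters, B-tree for ranges and ORDER BY.
    ALTER TABLE cache_rows
        ADD INDEX idx_status  (status)     USING HASH,
        ADD INDEX idx_created (created_at) USING BTREE;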
Insertion: I assume that inserting many rows with a single query is faster. But both the query string and the table it targets live in memory, so while a giant insert runs I would momentarily need roughly double the memory. So we are talking about inserting a hundred or two rows at a time and waiting for each statement to complete before sending more. InnoDB does not lock the whole table, but I worry that sending two inserts too close together could trip up MySQL. Is this a serious concern? With the MEMORY engine I would definitely have to wait for completion because of its table locks.
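The batching I mean looks like this sketch, a few hundred rows per statement (values shortened):

    -- One multi-row statement, repeated until the data set is loaded.
    INSERT INTO cache_rows (id, status, created_at, body_text) VALUES
        (1, 'ok',  '2024-01-01 00:00:00', '...'),
        (2, 'ok',  '2024-01-01 00:00:01', '...'),
        (3, 'err', '2024-01-01 00:00:02', '...');

    -- For the InnoDB-on-tmpfs variant, one enclosing transaction avoids
    -- a commit after every statement:
    START TRANSACTION;
    -- ... the batched INSERTs above ...
    COMMIT;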
Temporary: Are there any benefits to temporary tables besides being dropped automatically when the connection closes?
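For context, the variant I mean is this sketch; since temporary tables are visible only to their own connection, each connection could rebuild its copy under the same name:

    -- Dropped automatically when this connection closes.
    CREATE TEMPORARY TABLE cache_rows (
        id        INT NOT NULL,
        body_text VARCHAR(9400) NOT NULL,
        PRIMARY KEY (id)
    ) ENGINE=MEMORY;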