Our server (several Java applications on Debian) processes incoming data (GNSS observations), which should be:
- delivered to other applications immediately (delay < 200 ms),
- stored for future use.
Several times a day, about a million archived records need to be retrieved from the database. A record is about 12 double-precision fields plus a timestamp and a few identifiers. There are no UPDATEs; DELETEs are very rare but massive. The incoming stream is up to one hundred records per second. So I have to choose a data storage mechanism.
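For scale, the raw payload is small. A sketch of a fixed-size binary layout (the int-id/long-timestamp field choice is my assumption, not from the question) shows each record fits in 108 bytes, so 100 records/s is only about 11 KB/s:

```java
import java.nio.ByteBuffer;

public class RecordLayout {
    // Hypothetical fixed layout: int device id + long ms timestamp + 12 doubles.
    static final int RECORD_SIZE = Integer.BYTES + Long.BYTES + 12 * Double.BYTES; // 108 bytes

    static ByteBuffer encode(int deviceId, long timestampMs, double[] fields) {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE);
        buf.putInt(deviceId);
        buf.putLong(timestampMs);
        for (double f : fields) buf.putDouble(f);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer rec = encode(1, System.currentTimeMillis(), new double[12]);
        System.out.println("record size  = " + rec.remaining() + " bytes");     // 108
        System.out.println("ingest rate  = " + 100 * RECORD_SIZE + " bytes/s"); // 10800
    }
}
```

This is the baseline any storage engine's per-row overhead should be compared against.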
I tried MySQL (InnoDB): one application inserts, the others continually poll the last record identifier and, when it changes, retrieve the new records. That part works fine. But I ran into the following problems:
- Records are quite large (about 200-240 bytes per record).
- Retrieving millions of archived records is unacceptably slow (tens of minutes or more).
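The poll-by-last-ID pattern described above can be sketched as follows, with an in-memory list standing in for the MySQL table (class and method names are my own, not from the question):

```java
import java.util.ArrayList;
import java.util.List;

public class LastIdPoller {
    // Stand-in for the table: rows of [id, payload] with monotonically growing ids.
    static final List<long[]> table = new ArrayList<>();
    static long lastSeenId = 0;

    static synchronized void insert(long id, long payload) {
        table.add(new long[] { id, payload });
    }

    // One polling pass: fetch everything newer than lastSeenId,
    // the equivalent of "SELECT ... WHERE id > ?".
    static synchronized List<long[]> pollNew() {
        List<long[]> fresh = new ArrayList<>();
        for (long[] row : table) {
            if (row[0] > lastSeenId) {
                fresh.add(row);
                lastSeenId = row[0];
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        insert(1, 100);
        insert(2, 200);
        System.out.println(pollNew().size()); // 2
        insert(3, 300);
        System.out.println(pollNew().size()); // 1
    }
}
```

Against a real table, each pass would be a single indexed `id > ?` query, so the pattern itself easily meets the 200 ms target; the problem is row size and archive scans, not delivery.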
File storage would work very simply (there are no inserts in the middle of the data, and queries are mostly of the form "WHERE ID = 1 AND TIME BETWEEN 2000 AND 3000"), but there are other problems:
- Finding new data may not be so easy.
- Other data, such as logs and configs, are stored in one database, and I prefer to have one database for everything.
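For what it's worth, the first problem is smaller than it looks. With fixed-size records appended in time order (one file per identifier, a layout I am assuming here), the BETWEEN query is a binary search plus a sequential read, and new data can be detected by remembering the last file length seen. A minimal sketch:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TimeRangeFile {
    // Assumed fixed layout: long timestamp + 12 doubles = 104 bytes per record.
    static final int REC = Long.BYTES + 12 * Double.BYTES;

    static void append(FileChannel ch, long ts, double[] f) throws IOException {
        ByteBuffer b = ByteBuffer.allocate(REC);
        b.putLong(ts);
        for (double v : f) b.putDouble(v);
        b.flip();
        ch.write(b, ch.size()); // data arrives in time order, so appending keeps the file sorted
    }

    static long timestampAt(FileChannel ch, long idx) throws IOException {
        ByteBuffer b = ByteBuffer.allocate(Long.BYTES);
        ch.read(b, idx * REC);
        b.flip();
        return b.getLong();
    }

    // Index of the first record with timestamp >= ts (binary search on the sorted file).
    static long lowerBound(FileChannel ch, long ts) throws IOException {
        long lo = 0, hi = ch.size() / REC;
        while (lo < hi) {
            long mid = (lo + hi) >>> 1;
            if (timestampAt(ch, mid) < ts) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("gnss-id1-", ".bin");
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            for (long t = 1000; t <= 5000; t += 1000) append(ch, t, new double[12]);
            // "TIME BETWEEN 2000 AND 3000" -> records in [lowerBound(2000), lowerBound(3001))
            long from = lowerBound(ch, 2000), to = lowerBound(ch, 3001);
            System.out.println(to - from); // 2 records in range
            // "Finding new data": a reader only has to remember the file size it last saw.
            long knownSize = ch.size();
            append(ch, 6000, new double[12]);
            System.out.println(ch.size() > knownSize); // true: new records arrived
        }
        Files.delete(p);
    }
}
```

Retrieving a million contiguous records this way is one seek and roughly 100 MB of sequential I/O, which takes seconds rather than tens of minutes.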
Can you advise a suitable database engine (SQL is preferred but not required)? Or can MySQL be configured to reduce record size and speed up retrieval of contiguous time ranges?
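On the MySQL side, two things usually help this access pattern; a sketch with guessed column names (none of these names are from the question). A composite primary key (id, time) makes InnoDB cluster rows by that key, so the "WHERE ID = 1 AND TIME BETWEEN ..." query becomes a sequential range scan of the clustered index, and range partitioning on time turns the rare-but-massive deletions into cheap partition drops:

```sql
-- Hypothetical schema; column names and partition boundaries are illustrative.
CREATE TABLE observation (
  device_id SMALLINT UNSIGNED NOT NULL,
  obs_time  BIGINT NOT NULL,          -- e.g. milliseconds since epoch
  f01 DOUBLE NOT NULL,
  f02 DOUBLE NOT NULL,                -- remaining DOUBLE columns omitted for brevity
  PRIMARY KEY (device_id, obs_time)   -- InnoDB clusters data by the primary key
)
PARTITION BY RANGE (obs_time) (       -- old data removed via ALTER TABLE ... DROP PARTITION
  PARTITION p2000 VALUES LESS THAN (2000),
  PARTITION p3000 VALUES LESS THAN (3000),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```

Dropping the surrogate id column and any secondary indexes also cuts per-row size, at the cost of switching the pollers from "id > ?" to "obs_time > ?".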
MongoDB is unacceptable because database size is limited on 32-bit machines. Any engine that does not provide quick access to recently inserted data is also unacceptable.