Inconsistent SQLite Performance

I have a hand-written ORM on top of the Qt object system. I am testing it with the SQLite backend, and I see strange performance issues. The database stores about 10 thousand objects. Objects are loaded one after another, each in a separate query.

One of the queries shows widely varying execution time: from 1 ms to 10 ms, depending on the primary key id. This time also includes some of the work done by the Qt Sql module.

The query is very simple and looks like this (the id, 100 here, differs between queries):

SELECT * FROM t1, t2 WHERE t1.id = 100 AND t2.id = 100 

What can make the same query run 10 times slower depending on the row id?

1 answer

Given that you are timing in milliseconds, the behavior you observe makes a lot of sense. Benchmarking a single query at that time granularity generally makes little sense, unless you are interested only in latency and not in throughput.
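To see the spread rather than a single number, one option is to repeat each lookup many times and compare the median with the worst case. The following is only a sketch using Python's sqlite3 module instead of Qt Sql; the schema, the payload column, the in-memory database and the helper time_query are assumptions made for illustration, not the asker's actual setup.

```python
import sqlite3
import statistics
import time

# Hypothetical schema: two tables keyed by id, 10,000 rows each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE t2 (id INTEGER PRIMARY KEY, payload TEXT)")
rows = [(i, f"row {i}") for i in range(1, 10001)]
conn.executemany("INSERT INTO t1 VALUES (?, ?)", rows)
conn.executemany("INSERT INTO t2 VALUES (?, ?)", rows)
conn.commit()

QUERY = "SELECT * FROM t1, t2 WHERE t1.id = ? AND t2.id = ?"


def time_query(row_id, repeats=200):
    """Run the lookup `repeats` times; return each run's time in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(QUERY, (row_id, row_id)).fetchall()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


for row_id in (1, 5000, 10000):
    samples = time_query(row_id)
    print(f"id={row_id}: median={statistics.median(samples):.4f} ms, "
          f"max={max(samples):.4f} ms")
```

An in-memory database removes disk effects entirely, so to reproduce the variation described in the question the same loop would have to run against the real database file.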

In your particular query, for example, you will see a significant difference depending on whether there are matching rows in t1, as this determines whether SQLite has to look at t2 at all.

Even executing exactly the same query will give different timings depending on the OS file system cache, the process scheduler, the SQLite cache, the position of the hard drive platters and heads, and other factors.

To be more specific, there are two possibilities:

A. t1.id and t2.id are indexed

This is the most likely case: I would expect a table column named id to be indexed.

Most SQL engines, including SQLite, use some kind of B-tree for each index. In SQLite, each tree node is one page in the DB file. For your specific query, SQLite has to go through:

  • Some pages of the t1.id index
  • Some pages of the t2.id index
  • The DB pages that contain the corresponding rows from both tables.
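Whether the ids are actually used for an indexed lookup can be checked with EXPLAIN QUERY PLAN. The sketch below assumes the id columns are declared INTEGER PRIMARY KEY, which is a guess about the asker's schema; in SQLite such a column is an alias for the rowid, so the lookup walks the table's own B-tree rather than a separate index. The payload column is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: id is the INTEGER PRIMARY KEY (rowid alias) in both tables.
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE t2 (id INTEGER PRIMARY KEY, payload TEXT)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM t1, t2 WHERE t1.id = 100 AND t2.id = 100"
).fetchall()

# Each plan row has four columns; the last one is a human-readable detail.
details = [row[3] for row in plan]
for detail in details:
    print(detail)
```

The exact wording of the detail text differs between SQLite versions, but a line starting with SEARCH indicates a B-tree lookup, while SCAN indicates a full pass over the table.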

Depending on your hardware and how the pages are laid out on the physical media (for example, on your hard drive), loading a page can easily add a few milliseconds of latency. This is especially noticeable in large or freshly opened databases, where these pages are neither in the OS file system cache nor in the SQLite3 cache.

In addition, unless your database is very small, it usually will not fit into the SQLite3 cache, and cache hits and misses can explain quite significant variation in the time a single query takes: a miss means a read from the file system, which can easily lead to the database process being preempted by the OS in favor of another process.
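A rough way to check whether the database could fit in SQLite's page cache is to compare the file size in pages with the configured cache size. This is a sketch only; cache_report is a hypothetical helper, and the result says nothing about the OS file system cache.

```python
import sqlite3


def cache_report(conn):
    """Compare the database size with the configured SQLite page cache."""
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
    # A positive cache_size is a number of pages;
    # a negative one is a memory budget in KiB.
    if cache_size >= 0:
        cache_bytes = cache_size * page_size
    else:
        cache_bytes = -cache_size * 1024
    db_bytes = page_size * page_count
    return {
        "db_bytes": db_bytes,
        "cache_bytes": cache_bytes,
        "fits_in_cache": db_bytes <= cache_bytes,
    }


conn = sqlite3.connect(":memory:")  # replace with the path to the real file
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)",
                 [(i, "x" * 100) for i in range(1, 10001)])
conn.commit()
print(cache_report(conn))
```

If fits_in_cache is false, raising the limit with PRAGMA cache_size on the connection is one thing to try, at the cost of more memory.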

B. t1.id and t2.id are not indexed

This is probably easier to visualize: without indexes, SQLite has to scan the entire table. Assuming you have a LIMIT in your SELECT (you do not have one in your example), whether a matching record turns up immediately or only after going through most of the table is down to luck, hence the large variation in query completion times.
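The effect of a scan with a LIMIT can be sketched on a single table whose id is a plain, unindexed column. The table u1, its payload column, the row count and the helper lookup_ms are all assumptions for illustration; the point is only that the plan reports a scan and that the lookup time then depends on where the row sits in the table.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# Hypothetical table with NO index on id (a plain column, not a PRIMARY KEY).
conn.execute("CREATE TABLE u1 (id INTEGER, payload TEXT)")
conn.executemany("INSERT INTO u1 VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(1, 100001)])
conn.commit()

QUERY = "SELECT * FROM u1 WHERE id = ? LIMIT 1"

# The detail column of the plan should describe a full table scan.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,)).fetchall()
scan_detail = plan[0][3]
print(scan_detail)


def lookup_ms(row_id, repeats=20):
    """Return the fetched row and the best time in ms over `repeats` runs."""
    best = float("inf")
    row = None
    for _ in range(repeats):
        start = time.perf_counter()
        row = conn.execute(QUERY, (row_id,)).fetchone()
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return row, best


for row_id in (1, 50000, 100000):
    row, best = lookup_ms(row_id)
    print(f"id={row_id}: best of 20 = {best:.4f} ms")
```

With LIMIT 1 the scan can stop at the first match, so an id stored near the start of the table would be expected to return sooner than one stored near the end.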


Source: https://habr.com/ru/post/1346956/

