A query using subqueries in an SQLite database is about 10 times slower using newer versions of System.Data.SQLite / sqlite3.dll

(See update below)

I have a problem with slow query performance when requesting very simplified Sqlite data containing about 500,000 lines from a C # .Net application (~ 5 seconds).

I tried the exact same query in the same database using LinqPad, as well as two database browsers (both based on QtSql), and it works 10 times faster (~ 0.5 sec.). The same request, the same db, different applications, only mine does not work fast.

This makes a minor difference: whether it returns values ​​or just Count (*).

I tried:

  • for each of .Net 3.5 / 4 / 4.5
  • for each of AnyCPU / x86 / x64
  • using each of System.Data.Sqlite, sqlite-net, as well as direct access to sqlite3 dll via COM
  • for each of WPF / WinForms
  • various query options

None of them make any noticeable difference in request time.

I know that rewriting a query using JOIN can help, but what I cannot understand is that the same query works fine in LinqPad / Sql browsers, but not from any application that I am trying to create. I have to miss something quite fundamental.

Example table:

"CREATE TABLE items(id INTEGER PRIMARY KEY, id1 INTEGER, id2 INTEGER, value INTEGER)" 

Example query string (although basically any query using a subquery takes a lot of time):

 SELECT count(*) FROM items WHERE id2 IN ( SELECT DISTINCT id2 FROM items WHERE id1 IN ( SELECT DISTINCT id1 FROM items WHERE id2 = 100000 AND value = 10 ) AND value = 10 ) AND value = 10 GROUP BY id2 

I know that this can probably be rewritten using JOINS and indexing to speed it up, but the fact remains: this query works much faster from other applications. What am I missing here, why is the same query running much slower than I'm trying?

UPDATE: The sqlite version seems to have something to do with the problem. Using the deprecated System.Data.Sqlite v1.0.66.0, the query is executed in the same way as other applications, but the use of a later version is slow. I did not determine which version it was changed in, but I am sure that this is due to the basic version of sqlite3, and not to System.Data.Sqlite. If someone knows what could change, which will cause the subqueries to slow down so much in this situation, or if there are any settings or something that can make the subqueries work faster in newer versions of sqlite, please let me know!

Again, the request is an example and is not ideal and partially redundant ... the question is why it works in one and not the other.

Thanks in advance for any additional input!

UPDATE: SOLVED

See my answer below.

0
source share
3 answers

Ok finds out that this is due to the automatic indexing that was introduced with SQLite 1.7.0. In my situation, using subqueries in this table without indexes meant that the time taken by SQLite to create automatic indexes caused additional overhead that occurred in the queries.

The solution was to use:

 PRAGMA automatic_index=OFF; 

at the beginning of any query that uses the "IN" clause.

Creating indexes in columns can also solve this problem (unchecked), however, in this particular situation, the additional use of size / disk needed to create indexes is not worth it.

This also suggests that the LinqPad SQLite plugin and the database viewers I used are based on older versions of sqlite.

More information can be found at:

http://www.sqlite.org/src/info/8011086c85c6c4040

http://www.sqlite.org/optoverview.html#autoindex

Thanks to all who responded.

0
source

Some suggestions:

You say you don’t want to process your queries and not add indexes. This is the obvious thing here. Without any indexes, sqlite should scan 500,000 rows of the table at least once (or most likely several times).

Based on your query above, I would add indexes to id1 and id2 .

Another thing is that your request above seems a little redundant. You may have your own reasons, but I don’t understand why the request should be so complicated. Simplified request:

 select count(*) from items where id2 = 100000 and value = 10 

Strike>

0
source

try

 SELECT ID1.id2, count(*) FROM items ID2 JOIN items ID1 on ID2.id2 = ID1.id1 and ID1.id2 = 100000 and ID1.value = 10 and ID2.valu3 = 10 group by ID1.id2 
0
source

Source: https://habr.com/ru/post/1237319/


All Articles