Improving SQLite join performance

Check out the update at the bottom of this question: the unexpected variance in the query times noted below turned out to be the result of a quirk in sqliteman.

I have the following two tables in a SQLite DB (the structure may seem pointless, I know, but bear with me):

+-----------------------+
| source                |
+-----------------------+
| item_id | time | data |
+-----------------------+

+----------------+
| target         |
+----------------+
| item_id | time |
+----------------+

--Both tables have a multi column index on item_id and time

The source table contains about 500,000 rows. There will never be more than one corresponding record in the target table, and in practice almost every source row will have a corresponding target row.

I am trying to do a fairly standard anti-join to find all records in source that have no matching row in target, but I am struggling to write a query with an acceptable runtime.

I am using the following query:

SELECT
    source.item_id,
    source.time,
    source.data
FROM source
LEFT JOIN target USING (item_id, time)
WHERE target.item_id IS NULL;

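Executing the LEFT JOIN without the WHERE clause takes around 200 ms, whereas with it the query takes up to 5000 ms.

Although I first noticed the slow query in my own application, the timings above were obtained by running the statements directly in sqliteman.

Is there a particular reason why adding this seemingly simple clause increases the execution time so dramatically, and is there some way to improve it?

I have also tried the following, with exactly the same result. (I imagine the underlying query plan is the same.)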

SELECT 
    source.item_id,
    source.time,
    source.data
FROM source
WHERE NOT EXISTS (
    SELECT 1 FROM target
    WHERE target.item_id = source.item_id
    AND target.time = source.time
);
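
For anyone who wants to check that the two formulations really return the same rows, here is a minimal sketch using Python's built-in sqlite3 module. The sample data is made up (1,000 source rows, every 100th one without a target row), and the index names si and ti are simply the ones that show up in the query plan further down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (item_id INTEGER, time INTEGER, data TEXT);
    CREATE TABLE target (item_id INTEGER, time INTEGER);
    CREATE INDEX si ON source (item_id, time);
    CREATE INDEX ti ON target (item_id, time);
""")

# Hypothetical sample data: every 100th source row has no target row.
rows = [(i % 50, i, f"payload-{i}") for i in range(1000)]
conn.executemany("INSERT INTO source VALUES (?, ?, ?)", rows)
conn.executemany(
    "INSERT INTO target VALUES (?, ?)",
    [(item_id, t) for item_id, t, _ in rows if t % 100 != 0],
)

left_join = conn.execute("""
    SELECT source.item_id, source.time, source.data
    FROM source
    LEFT JOIN target USING (item_id, time)
    WHERE target.item_id IS NULL
    ORDER BY source.time
""").fetchall()

not_exists = conn.execute("""
    SELECT source.item_id, source.time, source.data
    FROM source
    WHERE NOT EXISTS (
        SELECT 1 FROM target
        WHERE target.item_id = source.item_id
          AND target.time = source.time
    )
    ORDER BY source.time
""").fetchall()

# Number of unmatched rows, and whether both queries agree.
print(len(left_join), left_join == not_exists)
```

This only shows that the results match on toy data; it says nothing about timings on the real 500,000-row table.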

Thanks!

Update

Sorry, it turns out that these apparent results are actually due to a quirk in sqliteman.

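It seems that sqliteman arbitrarily limits the number of rows returned to 256 and loads more dynamically as you scroll. This makes a query over a large result set appear much faster than it really is, which makes the tool a poor choice for estimating query performance.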
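
To illustrate the effect, here is a rough sketch (not a benchmark) that times the same anti-join twice with Python's sqlite3 module: once fetching only the first 256 rows, the way a paging GUI might, and once fetching everything. The table sizes and data are invented for the example, and the assumption that the GUI stops stepping the query after 256 rows is mine.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (item_id INTEGER, time INTEGER, data TEXT);
    CREATE TABLE target (item_id INTEGER, time INTEGER);
    CREATE INDEX si ON source (item_id, time);
    CREATE INDEX ti ON target (item_id, time);
""")

# Hypothetical data: 100,000 source rows, only the even ones have a target row.
conn.executemany(
    "INSERT INTO source VALUES (?, ?, ?)",
    ((i % 1000, i, "x") for i in range(100_000)),
)
conn.executemany(
    "INSERT INTO target VALUES (?, ?)",
    ((i % 1000, i) for i in range(0, 100_000, 2)),
)

QUERY = """
    SELECT source.item_id, source.time, source.data
    FROM source
    LEFT JOIN target USING (item_id, time)
    WHERE target.item_id IS NULL
"""

def timed(fetch):
    """Run QUERY, fetch rows with the given strategy, return (row count, seconds)."""
    start = time.perf_counter()
    cursor = conn.execute(QUERY)
    result = fetch(cursor)
    return len(result), time.perf_counter() - start

# A paging GUI only steps the statement until it has one page of rows.
first_page = timed(lambda cur: cur.fetchmany(256))
# A consuming application usually needs the whole result set.
everything = timed(lambda cur: cur.fetchall())

print(f"first 256 rows: {first_page[1]:.4f} s")
print(f"all {everything[0]} rows: {everything[1]:.4f} s")
```

The absolute numbers depend on the machine and the data, so timings should be taken from a full fetch in the consuming application (or the sqlite3 shell) rather than from the first page of a GUI.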

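Nonetheless, is there any obvious way to improve the performance of this query, or am I simply hitting the limits of what SQLite can do?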


This is the query plan for your query (either variant):

0|0|0|SCAN TABLE source
0|1|1|SEARCH TABLE target USING COVERING INDEX ti (item_id=? AND time=?)

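This is about as efficient as it gets:

  • every row in source must be checked,
  • by searching for a matching row in target.

One small improvement might be possible. The source rows are probably not stored in index order, so the target lookups land at random positions in the index. If we can force the source scan to run in index order, the target lookups will be in order too, which makes it more likely that the index pages they need are already in the cache.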

SQLite will use the source index if the query does not use any columns outside that index, i.e. if we drop the data column:

> EXPLAIN QUERY PLAN
  SELECT source.item_id, source.time
  FROM source
  LEFT JOIN target USING (item_id, time)
  WHERE target.item_id IS NULL;
0|0|0|SCAN TABLE source USING COVERING INDEX si
0|1|1|SEARCH TABLE target USING COVERING INDEX ti (item_id=? AND time=?)
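
If you would rather inspect the plans from code than from the sqlite3 shell, something along these lines should work. It is only a sketch: the schema is my guess at the one in the question (the column types are assumed), the tables are empty, and the exact wording of the plan lines differs between SQLite versions (newer releases print `SCAN source` instead of `SCAN TABLE source`, for example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (item_id INTEGER, time INTEGER, data TEXT);
    CREATE TABLE target (item_id INTEGER, time INTEGER);
    CREATE INDEX si ON source (item_id, time);
    CREATE INDEX ti ON target (item_id, time);
""")

def plan(sql):
    """Return the human-readable detail column of each EXPLAIN QUERY PLAN row."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Original query: needs the data column, which is not part of index si.
with_data = plan("""
    SELECT source.item_id, source.time, source.data
    FROM source
    LEFT JOIN target USING (item_id, time)
    WHERE target.item_id IS NULL
""")

# Reduced query: only indexed columns, so source can be read from si alone.
without_data = plan("""
    SELECT source.item_id, source.time
    FROM source
    LEFT JOIN target USING (item_id, time)
    WHERE target.item_id IS NULL
""")

for line in with_data + ["--"] + without_data:
    print(line)
```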

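This might not help much. But if it does, and you want the other columns of source, you can do the join first and then look up the source rows by their rowid (the extra lookup should not hurt if there are very few results):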

SELECT *
FROM source
WHERE rowid IN (SELECT source.rowid
                FROM source
                LEFT JOIN target USING (item_id, time)
                WHERE target.item_id IS NULL)
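
As a sanity check that the rowid variant returns the same rows as the plain anti-join, here is a small sketch with Python's sqlite3 module. The data is invented (1,000 source rows, 10 of them without a match), so it demonstrates equivalence only, not any speed-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (item_id INTEGER, time INTEGER, data TEXT);
    CREATE TABLE target (item_id INTEGER, time INTEGER);
    CREATE INDEX si ON source (item_id, time);
    CREATE INDEX ti ON target (item_id, time);
""")

# Hypothetical sample data: every 100th source row has no target row.
rows = [(i % 50, i, f"payload-{i}") for i in range(1000)]
conn.executemany("INSERT INTO source VALUES (?, ?, ?)", rows)
conn.executemany(
    "INSERT INTO target VALUES (?, ?)",
    [(item_id, t) for item_id, t, _ in rows if t % 100 != 0],
)

# Inner query touches only indexed columns plus rowid.
ANTI_JOIN_ROWIDS = """
    SELECT source.rowid
    FROM source
    LEFT JOIN target USING (item_id, time)
    WHERE target.item_id IS NULL
"""

# Join first, then fetch the full rows by rowid.
via_rowid = conn.execute(
    f"SELECT * FROM source WHERE rowid IN ({ANTI_JOIN_ROWIDS}) ORDER BY rowid"
).fetchall()

# Plain anti-join that reads all columns directly.
direct = conn.execute("""
    SELECT source.*
    FROM source
    LEFT JOIN target USING (item_id, time)
    WHERE target.item_id IS NULL
    ORDER BY source.rowid
""").fetchall()

print(len(via_rowid), via_rowid == direct)
```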

Source: https://habr.com/ru/post/1543495/

