Row counting in sqlite db

I have sqlite db on embedded ARM platform with Linux with limited resources. Storage device - microSD card. Sqlite version is 3.7.7.1. The application that accesses sqlite is written in C ++.

I want to know the number of rows in several tables evenly. I am currently using

select count(*) from TABLENAME; 

to get this information. I'm having performance issues: when the table sizes reach a certain point (~ 200 thousand rows), I have many system and iowait loads every time I check the table sizes.

When I wrote this, I, although finding the number of rows in a table would be quick, since it is probably stored somewhere. But now I suspect that sqlite is actually looking at all the rows, and when I pass in the point where the data no longer fits in the disk cache, I get a lot of io-loading. This roughly corresponds to db size and available memory.

Can someone tell me if sqlite works as I suspect?

Is there a way to get the number of rows of a table without creating this amount of load?

EDIT : plaes asked a question about the location of the table:

 CREATE TABLE %s (timestamp INTEGER PRIMARY KEY, offset INTEGER, value NUMERIC); 
+4
source share
3 answers

From all the information that I have collected, count () apparently really needs to scan the table. As plaes pointed out, this happens faster if the count is done on an integer indexed column, but index scanning is still necessary.

What I am doing now is storing the line counter somewhere and incrementing / decrementing it manually in the same transactions that I use for insertion and deletion to keep it consistent.

+1
source

Does this table have an integer index? If not, add one. Otherwise, it must scan the entire table to count the elements.

This is a snippet of comments from SQLite code that implements parsing and executing COUNT() :

  /* If isSimpleCount() returns a pointer to a Table structure, then ** the SQL statement is of the form: ** ** SELECT count(*) FROM <tbl> ** ** where the Table structure returned represents table <tbl>. ** ** This statement is so common that it is optimized specially. The ** OP_Count instruction is executed either on the intkey table that ** contains the data for table <tbl> or on one of its indexes. It ** is better to execute the op on an index, as indexes are almost ** always spread across less pages than their corresponding tables. */ [...] /* Search for the index that has the least amount of columns. If ** there is such an index, and it has less columns than the table ** does, then we can assume that it consumes less space on disk and ** will therefore be cheaper to scan to determine the query result. ** In this case set iRoot to the root page number of the index b-tree ** and pKeyInfo to the KeyInfo structure required to navigate the ** index. ** ** (2011-04-15) Do not do a full scan of an unordered index. 

In addition, you can get more information about your query with the EXPLAIN QUERY PLAN .

+2
source

Below are 2 possible ways to bypass the row table (with reservations) that do not cause the table / index to be scanned:

  • Note for tables in which you can use INTEGER PRIMARY KEY AUTOINCREMENT as the primary key, you can grab an account from sqlite_sequence sqlite metadata:

    select name, seq from sqlite_sequence

seq will contain either the last id or the next id (I think the last, but not sure).

  1. "select max (pkid) from the table", which is likely to search by index instead of scanning (and will also be accurate only for tables without deletion).

Knowing this, if your use case includes UNIQUE deletions for tables that you can use AUTOINCREMENT on, you can make a trigger-based hybrid solution and only count deleted rows (which would probably be less bookkeeping than counting inserts for most scenarios). However, if you insert and delete the same row twice, this will not work either.

0
source

Source: https://habr.com/ru/post/1382874/


All Articles