Python slow on fetchone, hangs on fetchall

I am writing a script that runs a SELECT query against a database and iterates through ~33,000 returned records. Unfortunately, I am running into problems at the cursor.fetchone() / cursor.fetchall() stage.

At first I tried to iterate over the records one at a time:

```python
# Run through every record, extract the kanji, then query for FK and weight
printStatus("Starting weight calculations")
while True:
    # Get the next row in the cursor
    row = cursor.fetchone()
    if row is None:
        break
    # TODO: Determine if there are any kanji in row[2]
    weight = float(row[3] + row[4]) / 2
    printStatus("Weight: " + str(weight))
```

Based on the output of printStatus (it prints a timestamp plus whatever line is passed to it), the script took about 1 second to process each row. This led me to believe the query was being re-run on every iteration of the loop (with a LIMIT 1 or something similar), since in SQLiteStudio the same query takes ~1 second to run once *and* returns all 33,000 rows. I calculated that at this rate it would take about 7 hours to get through all 33,000 entries.
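For clarity, printStatus is nothing fancy; a hypothetical reconstruction that matches the timestamped output format shown below would be:

```python
from datetime import datetime

def printStatus(message):
    # Hypothetical reconstruction: print a millisecond-precision timestamp
    # followed by the message, e.g. "12:56:26.019: Querying database"
    now = datetime.now()
    print("%s.%03d: %s" % (now.strftime("%H:%M:%S"), now.microsecond // 1000, message))
```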

Rather than sit through that, I tried using cursor.fetchall() instead:

```python
results = cursor.fetchall()
# Run through every record, extract the kanji, then query for FK and weight
printStatus("Starting weight calculations")
for row in results:
    # TODO: Determine if there are any kanji in row[2]
    weight = float(row[3] + row[4]) / 2
    printStatus("Weight: " + str(weight))
```

Unfortunately, when it hit cursor.fetchall(), the Python process locked up at 25% CPU and ~6 MB of RAM. I left the script running for ~10 minutes, but nothing happened.

Are 33,000 rows (about 5 MB of data) too much for Python to grab at once? Am I stuck fetching one row at a time? Or is there something I can do to speed things up?

EDIT: Here is the console output:

```
12:56:26.019: Adding new column 'weight' and related index to r_ele
12:56:26.019: Querying database
12:56:28.079: Starting weight calculations
12:56:28.079: Weight: 1.0
12:56:28.079: Weight: 0.5
12:56:28.080: Weight: 0.5
12:56:28.338: Weight: 1.0
12:56:28.339: Weight: 3.0
12:56:28.843: Weight: 1.5
12:56:28.844: Weight: 1.0
12:56:28.844: Weight: 0.5
12:56:28.844: Weight: 0.5
12:56:28.845: Weight: 0.5
12:56:29.351: Weight: 0.5
12:56:29.855: Weight: 0.5
12:56:29.856: Weight: 1.0
12:56:30.371: Weight: 0.5
12:56:30.885: Weight: 0.5
12:56:31.146: Weight: 0.5
12:56:31.650: Weight: 1.0
12:56:32.432: Weight: 0.5
12:56:32.951: Weight: 0.5
12:56:32.951: Weight: 0.5
12:56:32.952: Weight: 1.0
12:56:33.454: Weight: 0.5
12:56:33.455: Weight: 0.5
12:56:33.455: Weight: 1.0
12:56:33.716: Weight: 0.5
12:56:33.716: Weight: 1.0
```

And here is the SQL query:

 //...snip (it wasn't the culprit)... 

EXPLAIN QUERY PLAN output from SQLiteStudio:

```
0 0 0 SCAN TABLE r_ele AS re USING COVERING INDEX r_ele_fk (~500000 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1
1 0 0 SEARCH TABLE re_pri USING INDEX re_pri_fk (fk=?) (~10 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 2
2 0 0 SEARCH TABLE ke_pri USING INDEX ke_pri_fk (fk=?) (~10 rows)
2 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 3
3 0 0 SEARCH TABLE k_ele USING AUTOMATIC COVERING INDEX (value=?) (~7 rows)
3 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 4
4 0 0 SEARCH TABLE k_ele USING COVERING INDEX idx_k_ele (fk=?) (~10 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 5
5 0 0 SEARCH TABLE k_ele USING COVERING INDEX idx_k_ele (fk=?) (~10 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 6
6 0 0 SEARCH TABLE re_pri USING INDEX re_pri_fk (fk=?) (~10 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 7
7 0 0 SEARCH TABLE ke_pri USING INDEX ke_pri_fk (fk=?) (~10 rows)
7 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 8
8 0 0 SEARCH TABLE k_ele USING AUTOMATIC COVERING INDEX (value=?) (~7 rows)
8 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 9
9 0 0 SEARCH TABLE k_ele USING COVERING INDEX idx_k_ele (fk=?) (~10 rows)
```
1 answer

SQLite computes result records on the fly. fetchone is slow because it has to execute all the correlated subqueries for every record in r_ele. fetchall is even slower because it takes just as long as calling fetchone for every record.
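You can see this on-the-fly behavior directly. A small sketch, using a user-defined SQL function as a stand-in for an expensive correlated subquery, shows that execute() does almost no work and the per-row cost is paid at fetch time:

```python
import sqlite3

calls = {"n": 0}

def expensive(x):
    # Stand-in for the expensive correlated subqueries in the real query
    calls["n"] += 1
    return x

conn = sqlite3.connect(":memory:")
conn.create_function("expensive", 1, expensive)
cur = conn.cursor()
cur.execute("CREATE TABLE t (x INTEGER)")
cur.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

cur.execute("SELECT expensive(x) FROM t")
after_execute = calls["n"]    # at most one row has been computed so far
rows = cur.fetchall()
after_fetchall = calls["n"]   # now every row has been computed
print(after_execute, after_fetchall)
```

So fetchall is not "hung" on materializing 5 MB of data; it is busy running the subqueries for all 33,000 rows.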

SQLite 3.7.13 (the version used by SQLiteStudio) estimates that all the lookups on the value column would be terribly slow, and therefore creates a temporary index for this one query. SQLite 3.6.21 (the version used by your Python) does not do this, so you should create a permanent index that can be used by any version:

```sql
CREATE INDEX idx_k_ele_value ON k_ele(value);
```
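This only needs to be run once against the database file. A sketch of doing it from Python (demonstrated on an in-memory database with a minimal stand-in schema, since your real table and file names may differ):

```python
import sqlite3

# In the real script, connect to the actual dictionary file instead of ":memory:"
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE k_ele (fk INTEGER, value TEXT)")  # minimal stand-in schema

# IF NOT EXISTS makes this safe to run even if the index was already created
cur.execute("CREATE INDEX IF NOT EXISTS idx_k_ele_value ON k_ele(value)")
conn.commit()

# Verify the index now exists
row = cur.execute(
    "SELECT name FROM sqlite_master WHERE type='index' AND name='idx_k_ele_value'"
).fetchone()
```

Once the permanent index exists, the AUTOMATIC COVERING INDEX steps should disappear from the query plan.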

If this does not help, upgrade to a Python build with a newer version of SQLite, or use a different database library that ships its own newer SQLite, such as APSW.
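Before going that route, it is worth checking which SQLite library version your Python is actually linked against; it is independent of the Python version itself:

```python
import sqlite3

# Version of the underlying SQLite C library, e.g. "3.6.21"
# (not the version of the sqlite3 Python module)
print(sqlite3.sqlite_version)
```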


Source: https://habr.com/ru/post/952149/

