I have heckuva time dealing with slow MySQL queries in Python. In one area of my application, "load data infile" is fast. In another area, select queries are very slow.
Running the same request in PhpMyAdmin AND Navicat (like the second test) gives the answer ~ 5 times faster than in Python.
A few notes ...
- I switched to MySQLdb as a connector, and also used SSCursor. No performance increase.
- The database is optimized, indexed, etc. I port this Python application from PHP / Codeigniter, where it worked perfectly (I foolishly thought that exiting PHP would help speed it up).
- PHP / Codeigniter executes select queries quickly. For example, one key aspect of an application takes ~ 2 seconds in PHP / Codeigniter, but takes 10 seconds in Python before any data analysis is performed.
My database link is pretty standard ...
dbconn=MySQLdb.connect(host="127.0.0.1",user="*",passwd="*",db="*", cursorclass = MySQLdb.cursors.SSCursor)
Any ideas / help / advice would be greatly appreciated!
UPDATE
In terms of getting / processing the results, I tried this in several ways. The original request is pretty standard ...
I deleted all the code in this loop just to make sure that it is not like a bottle, and it is not. I put dummy code instead. The whole process was not accelerated at all.
db_results = "test"
The query result itself - only 501 rows (a large number of columns) ... took 0.029 seconds outside of Python. Taking significantly longer than in Python.
The project is connected with horse racing. The request is executed inside this function. The request itself is long, but it works great outside of Python. I commented on the code in a loop specifically for testing ... also print (request) in the hope of understanding this.
# Get PPs def get_pps(race_ids): # Comma Race List race_list = ','.join(map(str, race_ids)) # PPs Query query = ("SELECT raceindex.race_id, entries.entry_id, entries.prognum, runlines.line_id, runlines.track_code, runlines.race_date, runlines.race_number, runlines.horse_name, runlines.line_date, runlines.line_track, runlines.line_race, runlines.surface, runlines.distance, runlines.starters, runlines.race_grade, runlines.post_position, runlines.c1pos, runlines.c1posn, runlines.c1len, runlines.c2pos, runlines.c2posn, runlines.c2len, runlines.c3pos, runlines.c3posn, runlines.c3len, runlines.c4pos, runlines.c4posn, runlines.c4len, runlines.c5pos, runlines.c5posn, runlines.c5len, runlines.finpos, runlines.finposn, runlines.finlen, runlines.dq, runlines.dh, runlines.dqplace, runlines.beyer, runlines.weight, runlines.comment, runlines.long_comment, runlines.odds, runlines.odds_position, runlines.entries, runlines.track_variant, runlines.speed_rating, runlines.sealed_track, runlines.frac1, runlines.frac2, runlines.frac3, runlines.frac4, runlines.frac5, runlines.frac6, runlines.final_time, charts.raceshape " "FROM hrdb_raceindex raceindex " "INNER JOIN hrdb_runlines runlines ON runlines.race_date = raceindex.race_date AND runlines.track_code = raceindex.track_code AND runlines.race_number = raceindex.race_number " "INNER JOIN hrdb_entries entries ON entries.race_date=runlines.race_date AND entries.track_code=runlines.track_code AND entries.race_number=runlines.race_number AND entries.horse_name=runlines.horse_name " "LEFT JOIN hrdb_charts charts ON runlines.line_date = charts.race_date AND runlines.line_track = charts.track_code AND runlines.line_race = charts.race_number " "WHERE raceindex.race_id IN (" + race_list + ") " "ORDER BY runlines.line_date DESC;") print(query) # Run Query cursor.execute(query) # Query Fields fields = [i[0] for i in cursor.description] # PPs List pps = [] # Loop Results for row in cursor: a = 0 #this_pp = {} #for i, value in enumerate(row): # this_pp[fields[i]] = value #pps.append(this_pp) return pps
One final note ... I did not consider the ideal way to handle the result. I believe that one cursor allows you to return the result as a set of dictionaries. I didn’t even get to this point, since the request and return of myself are so slow.