Slow MySQL queries in Python, but fast elsewhere

I'm having a heck of a time dealing with slow MySQL queries in Python. In one area of my application, "LOAD DATA INFILE" is fast. In another area, SELECT queries are very slow.

Running the same query in phpMyAdmin AND Navicat (as a second test) returns the results roughly 5 times faster than in Python.

A few notes ...

  • I switched to MySQLdb as the connector and also used SSCursor. No performance increase (for a quick way to compare cursor classes, see the timing sketch after this list).
  • The database is optimized, indexed, etc. I am porting this Python application from PHP/CodeIgniter, where it ran perfectly (I foolishly thought that leaving PHP would help speed things up).
  • PHP/CodeIgniter executes the SELECT queries quickly. For example, one key part of the application takes ~2 seconds in PHP/CodeIgniter, but takes 10 seconds in Python before any data analysis is even performed.
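
A quick way to check whether the cursor class itself is the culprit is to time the same query with the default buffered cursor and with SSCursor. This is only a rough sketch I would use for testing, with placeholder credentials and an illustrative query, not part of the application:

    import time

    import MySQLdb
    import MySQLdb.cursors

    def time_query(cursorclass, query):
        # Open a fresh connection using the given cursor class (credentials are placeholders)
        conn = MySQLdb.connect(host="127.0.0.1", user="*", passwd="*", db="*",
                               cursorclass=cursorclass)
        cur = conn.cursor()
        start = time.time()
        cur.execute(query)
        rows = cur.fetchall()        # force the entire result set to be transferred
        elapsed = time.time() - start
        cur.close()
        conn.close()
        return len(rows), elapsed

    test_query = "SELECT * FROM hrdb_runlines LIMIT 501"    # illustrative query only
    print(time_query(MySQLdb.cursors.Cursor, test_query))   # default buffered cursor
    print(time_query(MySQLdb.cursors.SSCursor, test_query)) # unbuffered server-side cursor

If both cursor classes give essentially the same numbers, the cursor choice is not the bottleneck.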

My database connection is pretty standard ...

    dbconn = MySQLdb.connect(host="127.0.0.1", user="*", passwd="*", db="*",
                             cursorclass=MySQLdb.cursors.SSCursor)

Any ideas / help / advice would be greatly appreciated!

UPDATE

In terms of fetching/processing the results, I have tried this several ways. The original query call is pretty standard ...

    # Run Query
    cursor.execute(query)

I stripped all of the code inside this loop just to make sure it wasn't the bottleneck, and it isn't. I put dummy code in its place. The whole process did not speed up at all.

 db_results = "test" # Loop Results for row in cursor: a = 0 (this was the dummy code I put in to test) return db_results 

The query result itself is only 501 rows (with a large number of columns). It took 0.029 seconds outside of Python, but takes significantly longer inside Python.
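
One way I tried to narrow this down (a sketch I'd add purely for testing, not part of the original function) is to time the execute and the fetch separately; with SSCursor the execute usually returns quickly and most of the time shows up while the rows are streamed to the client:

    import time

    start = time.time()
    cursor.execute(query)              # time spent sending/starting the query
    executed = time.time()

    rows = cursor.fetchall()           # time spent transferring rows to Python
    fetched = time.time()

    print("execute: %.3fs  fetch: %.3fs  rows: %d"
          % (executed - start, fetched - executed, len(rows)))

If nearly all of the time is in the fetch, the problem is in moving the result set to the client rather than in the query plan itself.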

The project is related to horse racing. The query is executed inside this function. The query itself is long, but it performs great outside of Python. I commented out the code in the loop specifically for testing, and added print(query) in the hope of making sense of this.

    # Get PPs
    def get_pps(race_ids):
        # Comma Race List
        race_list = ','.join(map(str, race_ids))

        # PPs Query
        query = ("SELECT raceindex.race_id, entries.entry_id, entries.prognum, runlines.line_id, "
                 "runlines.track_code, runlines.race_date, runlines.race_number, runlines.horse_name, "
                 "runlines.line_date, runlines.line_track, runlines.line_race, runlines.surface, "
                 "runlines.distance, runlines.starters, runlines.race_grade, runlines.post_position, "
                 "runlines.c1pos, runlines.c1posn, runlines.c1len, runlines.c2pos, runlines.c2posn, "
                 "runlines.c2len, runlines.c3pos, runlines.c3posn, runlines.c3len, runlines.c4pos, "
                 "runlines.c4posn, runlines.c4len, runlines.c5pos, runlines.c5posn, runlines.c5len, "
                 "runlines.finpos, runlines.finposn, runlines.finlen, runlines.dq, runlines.dh, "
                 "runlines.dqplace, runlines.beyer, runlines.weight, runlines.comment, "
                 "runlines.long_comment, runlines.odds, runlines.odds_position, runlines.entries, "
                 "runlines.track_variant, runlines.speed_rating, runlines.sealed_track, "
                 "runlines.frac1, runlines.frac2, runlines.frac3, runlines.frac4, runlines.frac5, "
                 "runlines.frac6, runlines.final_time, charts.raceshape "
                 "FROM hrdb_raceindex raceindex "
                 "INNER JOIN hrdb_runlines runlines ON runlines.race_date = raceindex.race_date "
                 "AND runlines.track_code = raceindex.track_code "
                 "AND runlines.race_number = raceindex.race_number "
                 "INNER JOIN hrdb_entries entries ON entries.race_date=runlines.race_date "
                 "AND entries.track_code=runlines.track_code "
                 "AND entries.race_number=runlines.race_number "
                 "AND entries.horse_name=runlines.horse_name "
                 "LEFT JOIN hrdb_charts charts ON runlines.line_date = charts.race_date "
                 "AND runlines.line_track = charts.track_code "
                 "AND runlines.line_race = charts.race_number "
                 "WHERE raceindex.race_id IN (" + race_list + ") "
                 "ORDER BY runlines.line_date DESC;")
        print(query)

        # Run Query
        cursor.execute(query)

        # Query Fields
        fields = [i[0] for i in cursor.description]

        # PPs List
        pps = []

        # Loop Results
        for row in cursor:
            a = 0
            #this_pp = {}
            #for i, value in enumerate(row):
            #    this_pp[fields[i]] = value
            #pps.append(this_pp)

        return pps

One final note ... I have not settled on the ideal way to handle the results. I believe one of the cursor classes lets you return the results as a set of dictionaries. I haven't even gotten to that point, since the query and fetch themselves are so slow.
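
For reference, that would be MySQLdb's DictCursor (or SSDictCursor for the unbuffered variant), which returns each row as a dict keyed by column name. A minimal sketch with placeholder credentials and an illustrative query:

    import MySQLdb
    import MySQLdb.cursors

    dbconn = MySQLdb.connect(host="127.0.0.1", user="*", passwd="*", db="*",
                             cursorclass=MySQLdb.cursors.DictCursor)
    cursor = dbconn.cursor()
    cursor.execute("SELECT race_id, track_code FROM hrdb_raceindex LIMIT 5")  # illustrative query
    for row in cursor:
        print(row["race_id"], row["track_code"])   # each row is a dict keyed by column name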

2 answers

You have only 501 rows, but it looks like you have more than 50 columns. How much data is actually being transferred from MySQL to Python?

501 rows x 55 columns = 27,555 cells returned.

If each cell averaged "only" 1K, that would be close to 27 MB of data returned.

To get a sense of how much data MySQL is pushing, you can add this after your query:

 SHOW SESSION STATUS LIKE "bytes_sent" 
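
A rough way to check this from Python on the same connection (Bytes_sent is a cumulative per-session counter, so sample it before and after; with SSCursor the previous result must be fully consumed before issuing another statement):

    # Sample the session counter before the slow query
    cursor.execute("SHOW SESSION STATUS LIKE 'Bytes_sent'")
    before = int(cursor.fetchall()[0][1])

    cursor.execute(query)
    rows = cursor.fetchall()          # make sure the entire result set is transferred

    # Sample it again afterwards; the difference approximates this query's payload
    cursor.execute("SHOW SESSION STATUS LIKE 'Bytes_sent'")
    after = int(cursor.fetchall()[0][1])
    print("approximate bytes sent for this query:", after - before)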

Is your server adequately resourced? Is its memory allocation well configured?

I assume that when you use phpMyAdmin you are getting paginated results. That would mask the problem of MySQL returning more data than your server can handle (I don't use Navicat, so I'm not sure how it returns results).

Perhaps the Python process is memory-constrained and, when faced with this large result set, has to page out to disk in order to process it.

If you reduce the number of columns requested and/or limit the rows returned (say, LIMIT 10 on your query), do you see improved speed?

Can you see whether the server is paging to disk while Python runs this query? Can you see how much memory is allocated to Python, how much is used during the process, and how that allocation and usage compares with the same values in the PHP version?

Can you allocate more memory to the constrained resource?

Can you reduce the number of columns or rows required, for example via pagination or asynchronous loading?
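
On the memory question above, one rough, Unix-only way to sample the Python process's peak memory around the fetch is the standard-library resource module (units are kilobytes on Linux and bytes on macOS); this is a diagnostic sketch, not part of the original code:

    import resource

    def peak_rss():
        # Peak resident set size of this process so far
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    print("peak RSS before fetch:", peak_rss())
    cursor.execute(query)
    rows = cursor.fetchall()
    print("peak RSS after fetch:", peak_rss())

If the peak jumps by hundreds of megabytes here but not in the PHP version, the result-set size and client-side buffering are the place to look.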


I know this is late, but I ran into similar issues with MySQL and Python. My solution was to run the queries from a different language ... I use R for my queries, which is blindingly fast, do what I can in R, and then send the data to Python if necessary for more general programming, although R has plenty of general-purpose capability and libraries as well. I just wanted to post something that might help someone with a similar problem, even though I know it sidesteps the heart of the problem.

