Python running very slowly on a Linux server ... but fast on a Linux dev VM?

Using the same code, each call to a logging method (for example, logger.debug) takes about 50 ms on the server, but less than 1 ms on the dev machine. Logs are written to files with a bit of formatting.
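For reference, a minimal timing loop of the kind involved looks like this (the handler, format string, and logger name here are simplified stand-ins, not the exact production config):

    import logging
    import time

    # Simplified sketch: time individual debug() calls going to a file handler
    # with some formatting; names and format are placeholders, not the real setup.
    logging.basicConfig(filename="timing.log",
                        level=logging.DEBUG,
                        format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    logger = logging.getLogger("mylogger")

    calls = 1000
    start = time.time()
    for i in range(calls):
        logger.debug("test message %d", i)
    elapsed = time.time() - start

    print("%.3f ms per call" % (elapsed / calls * 1000))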

Apart from the slowdown during logging, the server is otherwise about twice as fast as the dev machine.

The dev machine is Ubuntu 11.04 (GNOME), running inside VMware on Windows 7. The server is running Ubuntu Server 11.04 (no GUI, plain console). The logging module is the standard "logging" module ("import logging ... logger = logging.getLogger("mylogger")").

Any idea what could be causing this? This is very frustrating!

Thanks for any help!

EDIT: Both machines return "Python 2.7.1+" for their version. Both machines are running 64-bit Ubuntu.

The server’s hard drive configuration is software RAID-1, and on the developer's computer there’s only one drive.

EDIT2: Accepted Fabian's answer since it was thorough, even though it did not solve the problem.

Solution: writing to the console is just very slow, period. I tested writing the same data to a file and to the console, and the console was about 100 times slower. I don't know why that would be, but I simply switched to working over SSH from another computer, and the problem was solved.
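An illustrative comparison of the two cases (an assumed setup, not the poster's exact test; payload size and iteration count are arbitrary):

    import sys
    import time

    # Write the same payload N times to a file and to stdout and compare timings.
    payload = "x" * 80 + "\n"
    iterations = 10000

    start = time.time()
    with open("bench.out", "w") as f:
        for _ in range(iterations):
            f.write(payload)
    file_time = time.time() - start

    start = time.time()
    for _ in range(iterations):
        sys.stdout.write(payload)
    console_time = time.time() - start

    sys.stderr.write("file: %.3fs  console: %.3fs\n" % (file_time, console_time))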

1 answer

As noted in the comments, a possible reason is a difference in disk speed between the development VM and the production machine. Do you have the same kind of drives in both systems, e.g. SSD vs. SATA vs. SCSI, spindle speed, cache size, and so on? The two environments also differ in how they handle I/O: desktop Windows and VMware will use aggressive disk caching, while your production Linux machine will most likely err on the side of safety and wait for data to actually reach the disk more often. Perhaps the Windows machine has drivers better tuned for its particular drive, while the server runs without such optimizations. Filesystem differences matter as well, and the hardware may simply be different enough to cause a significant gap in I/O speed. There can also be big differences in CPU speed and RAM: desktop machines are often tuned for raw speed, while server hardware tends to be tuned for reliability. You know your setup best, so compare the two systems in terms of hardware performance.

Other than this, here's how you can find out what is really happening:

First, write a minimal working example (MWE) that exercises logging. You should base it on your real code and use logging in a similar way, but here is a small example:

    import logging

    logging.basicConfig(filename="test.log", level=logging.DEBUG)
    logger = logging.getLogger("testlogger")

    for i in range(0, 1000000):
        logger.info("iteration: %d", i)

Then run the script under cProfile on both your development machine and the production machine. Make sure to log to the same file system as in your real setup, otherwise the results won't be applicable.

 python -m cProfile testlogging.py 

You will get output that looks like this:

    57000501 function calls in 137.072 seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    0.000    0.000 UserDict.py:4(__init__)
         1    0.000    0.000    0.000    0.000 __init__.py:1044(_fixupParents)
         1    0.000    0.000    0.000    0.000 __init__.py:1085(Logger)
         2    0.000    0.000    0.000    0.000 __init__.py:1100(__init__)
         1    0.000    0.000    0.000    0.000 __init__.py:1112(setLevel)
       ...      ...      ...      ...      ... ...............................

This should give you an idea of what causes the slowness on the production machine. What to look for in particular:

  • Find the lines that read {method 'write' of 'file' objects} and {method 'flush' of 'file' objects}. These show how much time Python spent writing to files and flushing data to disk, in this case the log file. Are there significant differences between the two machines? If so, there is definitely a difference in I/O (disk) speed, and you should look at the drive setup on the server to see whether anything can be done to improve its performance.
  • Look at the rows where the first percall column is especially large. This column is the total time spent in the function divided by the number of calls to it. Compare between the two machines and you may find the reason for the difference.
  • Look at the rows where the tottime column is especially large. This column is the total time spent inside the function itself. Again, compare the two machines and you may find some of the reasons for the speed difference (a small pstats sketch for sorting and filtering these columns follows this list).
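A convenient way to compare these columns is to save the profile to a file and sort it with the standard pstats module. A minimal sketch, assuming the profile was saved with "python -m cProfile -o profile.out testlogging.py" (the file name is an assumption):

    import pstats

    # Load the saved profile and print the heaviest entries.
    stats = pstats.Stats("profile.out")
    stats.sort_stats("tottime").print_stats(10)   # most time inside functions themselves
    stats.sort_stats("cumtime").print_stats(10)   # most cumulative time, sub-calls included

    # Restrict the report to the file write/flush methods mentioned above.
    stats.print_stats("method 'write'|method 'flush'")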

If you find that a problem with the I / O disk seems to be a problem, you can run an additional test with raw writing to files only. You can probably find a benchmarking program that allows you to test disk throughput, but you can also just write a simple C (or Python) program that writes unformatted data to a file to make sure that it is actually pure disk performance, which is the difference.
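A rough raw-write sketch in Python (file name, block size and block count are arbitrary assumptions):

    import os
    import time

    # Write fixed-size blocks and fsync at the end, so the OS cache
    # does not hide the real disk speed.
    block = "x" * 4096
    count = 25000          # roughly 100 MB in total

    start = time.time()
    with open("rawwrite.test", "w") as f:
        for _ in range(count):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.time() - start

    print("%.1f MB/s" % (len(block) * count / (1024.0 * 1024.0) / elapsed))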

One final note: performance testing, like programming, is a combination of art, science and craft, and although there are recipes and tips you can follow, each case takes a bit of ingenuity to crack. So try things out, make sure you are not fooling yourself, and have fun! Good luck.

