Is MySQL access speed determined mainly by the database itself, or by the language used to access it?

I need to update a large amount of data quickly. It may be easier to code in a scripting language, but I suspect that a C program would make the update faster. Does anyone know of any comparative speed tests?

+6
source share
7 answers

Not really. Update speed depends on:

  • database configuration (engine used, db configuration)
  • server hardware, especially the hard disk subsystem
  • network bandwidth between the source and target machines
  • amount of data transferred

I suspect you think the scripting language will only be a factor in that last item, the amount of data transferred.

Any scripting language will be fast enough to deliver the data. If you have a large amount of data that you need to parse/transform quickly, then yes, C would definitely be the language of choice. But if you are just sending simple string data to the db, there is no point, although it is not hard to write a simple C program for an UPDATE operation. Doing it in C is roughly at the same level of complexity as using PHP's mysql_ functions.

+4
source

Are you concerned about speed because you are already dealing with a situation where it is a problem, or are you planning ahead?

I can say with confidence that database interactions are usually limited by I/O, network bandwidth, memory, database traffic, SQL complexity, database configuration, indexing problems, and the amount of data selected, far more than by the choice of a scripting language versus C.

When you do encounter bottlenecks, they will almost always be solved with a better algorithm, smarter use of indexes, faster I/O devices, more caching... those kinds of things (starting with algorithms).

The fourth component of LAMP is a scripting language anyway. When it comes to fine-tuning, memcached becomes an option, as do persistent interpreters (mod_perl in a web environment, for example).

+4
source

The bulk of a database transaction happens on the database side. The cost of parsing/compiling your SQL statement and planning the query's execution far outweighs any difference attributable to the language that sent it.

It is rare for CPU usage in the application talking to the database to be a bigger factor than CPU utilization on the database server, or disk speed on that server.

Unless your applications run for a long time and spend it waiting on the database, I would not worry about benchmarking them. If they do need benchmarking, do it yourself: data usage patterns vary greatly, and you need your own numbers.

+3
source

I have heard that the C API is faster, but I have not seen any benchmarks. To perform large database operations quickly, regardless of programming language, use stored procedures: http://dev.mysql.com/tech-resources/articles/mysql-storedprocedures.html .

The speedup comes from the reduction in network traffic.

From this link:

Stored procedures are fast! Well, we can't prove that for MySQL yet, and everyone's experience will vary. What we can say is that the MySQL server takes some advantage of caching, just as prepared statements do. There is no compilation, so an SQL stored procedure won't work as quickly as a procedure written with an external language such as C. The main speed gain comes from reduction of network traffic. If you have a repetitive task that requires checking, looping, multiple statements, and no user interaction, do it with a single call to a procedure that's stored on the server. Then there won't be messages going back and forth between server and client for every step of the task.
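As an illustration of the network-traffic point (table and procedure names here are invented, not from the article): instead of the client issuing several statements and shuttling intermediate results over the wire, the whole task runs server-side in one call:

```sql
DELIMITER //
CREATE PROCEDURE archive_old_orders(IN cutoff DATE)
BEGIN
  -- several statements, but only one client/server round trip
  INSERT INTO orders_archive SELECT * FROM orders WHERE placed < cutoff;
  DELETE FROM orders WHERE placed < cutoff;
END //
DELIMITER ;

-- one call from any client language replaces the whole exchange
CALL archive_old_orders('2010-01-01');
```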

+1
source

Since C is a lower-level language, it will not have the type parsing/conversion overhead that scripting languages have. A MySQL int can map directly to a C int, while a PHP int carries various metadata that must be populated/updated.

On the other hand, if you need to do any text manipulation as part of this big update, any speed gained from C is likely to be lost in hair-pulling and debugging, because C's support for string manipulation is poor compared to what you can do with trivial ease in a scripting language like Perl or PHP.

+1
source

The C API will be a little faster, for the simple reason that any other language (whether a "scripting language" or a fully compiled one) will probably map onto the C API at some level anyway. Using the C API directly will obviously be a few dozen CPU cycles faster than performing that mapping and then calling the C API.

But this is just a drop in the ocean. Even a main-memory access is an order of magnitude or two slower than a CPU cycle on a modern machine, and I/O operations (disk or network access) are several orders of magnitude slower still. There is no point optimizing to send the request a microsecond sooner when the actual execution of the query takes half a second (or even several seconds, for complex queries or ones that return large amounts of data).

Choose the language you will be most productive in, and don't worry about micro-optimizing the choice. Even if the language itself were to become a performance issue (which is highly unlikely), your extra productivity would save more money than an additional server would cost.

+1
source

I have found that for large batches of data (gigabytes or more), the most effective approach is usually to dump the data from MySQL into a file (or several files) on the application machine, process it there (insert your favorite tool here: mine is Perl), and use LOAD DATA LOCAL INFILE to load it back into a new table, doing as little as possible in SQL. Along the way you should:

  • drop the indexes from the table before the LOAD (maybe not necessary for MyISAM, but meh);

  • always, ALWAYS load the data in PK order!

  • add the indexes back after loading.
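The sequence above, in SQL (table and file names are invented; DISABLE KEYS only suspends non-unique MyISAM indexes, so for InnoDB you would drop and re-create the secondary indexes instead):

```sql
ALTER TABLE big_table DISABLE KEYS;          -- skip index maintenance during the load

LOAD DATA LOCAL INFILE '/tmp/big_table.sorted.csv'
  INTO TABLE big_table
  FIELDS TERMINATED BY ',';                  -- file pre-sorted by PK on the app machine

ALTER TABLE big_table ENABLE KEYS;           -- rebuild the indexes in one pass
```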

Another advantage is that it is much easier to parallelize the processing on a cheap machine with a bunch of fast disks than to parallelize writes to your expensive, non-scalable database master.

Anyway, with large datasets the database is usually the bottleneck.

0
source

Source: https://habr.com/ru/post/888591/
