Query cost vs. execution speed + parallelism

My department was recently reprimanded (and rightly so) by our IT department for running queries with very high costs, on the grounds that our queries have a real chance of destabilizing and/or crashing the database. None of us is a database administrator; we are just researchers who write and run database queries, and I am probably the only one who had ever looked at an explain plan before the reprimand.

We were told that queries with a cost over 100 should be very rare, and queries with a cost over 1,000 should never be run. The problems I am facing are that cost does not appear to correlate with runtime, and that I lose performance when I try to optimize my queries.

As an example, I have a query that runs in less than 5 seconds with a cost of 10,844. I rewrote the query to use a view that contains most of the information I need and got the cost down to 109, but the new query, which returns the same results, takes 40 seconds. I found a question here with a possible explanation:

Measuring query performance: "execution plan query cost" vs. "time taken"

That question led me to the topic of parallelism hints. I tried using /*+ no_parallel */ in the query costing 10,844, but the cost did not change and neither did the runtime, so I am not sure whether parallelism is the explanation for the faster execution but higher cost. I then tried the /*+ parallel(n) */ hint and found that the higher the value of n, the lower the cost of the query. For the query costing 10,844, I found that /*+ parallel(140) */ reduced the cost to 97, with only a very slight increase in runtime.
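
For reference, this is roughly how I applied the hints (the table and column names below are placeholders, not my actual query):

    -- Serial version, to check whether parallelism explains the cost difference
    SELECT /*+ no_parallel */
           r.study_id,
           COUNT(*) AS sample_count
    FROM   research_results r
    GROUP  BY r.study_id;

    -- Parallel version: a higher degree of parallelism reports a much lower cost
    SELECT /*+ parallel(140) */
           r.study_id,
           COUNT(*) AS sample_count
    FROM   research_results r
    GROUP  BY r.study_id;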

This seemed like the perfect trick for meeting the requirements set by our IT department, but then I read this:

http://www.oracle.com/technetwork/articles/datawarehouse/twp-parallel-execution-fundamentals-133639.pdf

The article contains the following sentence:

Parallel execution can enable a single operation to utilize all system resources.

So my questions are:

Am I really putting more load on server resources by using the /*+ parallel(n) */ hint with a very high degree of parallelism, even though I am lowering the cost?

Assuming no parallelism, is execution speed a better measure of the resources used than cost?

1 answer

The rule the database administrator gave you does not make much sense. Worrying about the cost reported for a query is very rarely productive. First, you cannot directly compare the costs of two different queries: one query with a cost in the millions may run very quickly and consume very few system resources, while another query with a cost in the hundreds may run for hours and bring the server to its knees. Second, cost is an estimate. If the optimizer estimated the cost accurately, that strongly implies it came up with an optimal query plan, which would mean you are unlikely to be able to rewrite the query to return the same results using fewer resources. If the optimizer estimated the cost inaccurately, that pretty much means it picked a bad query plan, in which case the reported cost has no relationship to any useful metric you might come up with. Most of the time, the queries you are trying to optimize are precisely the ones where the optimizer generated a wrong plan because it misestimated the cost of the various steps.

Overriding the optimizer with hints that may or may not actually change the query plan (depending on how parallelism is configured, for example) is unlikely to solve the problem. It is much more likely to make the optimizer's estimates less accurate and make it more likely to choose a query plan that consumes far more resources than necessary. A parallel hint with a high degree of parallelism, for example, tells Oracle to drastically reduce the estimated cost of a full table scan, which makes it more likely that the optimizer will pick a full scan over an index scan. That is rarely what your database administrators want to see.
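
One quick way to see whether a hint actually changed the plan, and not just the reported cost, is to display the plan with and without the hint; a minimal sketch, using a placeholder query:

    EXPLAIN PLAN FOR
    SELECT /*+ parallel(140) */ r.study_id, COUNT(*)
    FROM   research_results r
    GROUP  BY r.study_id;

    -- Shows the plan operations and estimated cost; PX steps appear only
    -- if the optimizer really produced a parallel plan
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);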

If you are looking for a single metric that tells you whether a query plan is reasonable, I would suggest the number of logical I/O operations. Logical I/O correlates quite well with actual query performance and with the amount of resources your query consumes. Looking at elapsed time can be problematic because it varies significantly depending on what data happens to be cached (which is why queries often run much faster the second time they are executed), while logical I/O does not change based on what is in the cache. It also lets you scale your expectations as the number of rows your queries have to process changes. A query that has to aggregate data from 1 million rows, for example, should consume far more resources than a query that returns 100 rows from a table with no aggregation. If you look at logical I/O, you can easily scale your expectations to the size of the problem and see how efficient your queries really are.

In "Troubleshooting Oracle Performance" (p. 450), for example, Christian Antognini gives a rule of thumb that is pretty reasonable:

  • 5 logical reads per row returned/aggregated is probably very good
  • 10 logical reads per row returned/aggregated is probably adequate
  • 20+ logical reads per row returned/aggregated is probably inefficient and needs to be tuned

Different systems with different data models may merit tweaking those buckets a bit, but they are good starting points.
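
A rough way to check that ratio after a query has run is to look at V$SQL (assuming you have access to the V$ views; the text filter below is just a placeholder for identifying your statement):

    SELECT sql_id,
           buffer_gets,      -- logical reads, total across executions
           rows_processed,   -- rows returned, total across executions
           ROUND(buffer_gets / NULLIF(rows_processed, 0), 1) AS logical_reads_per_row
    FROM   v$sql
    WHERE  sql_text LIKE 'SELECT /*+ parallel(140) */%'
    ORDER  BY last_active_time DESC;

If you work in SQL*Plus, SET AUTOTRACE TRACEONLY STATISTICS reports the logical reads for a single execution as "consistent gets" plus "db block gets".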

My guess is that since you are researchers rather than developers, you are probably working with queries that need to aggregate or fetch relatively large data sets, at least compared to the queries application developers typically write. If you are scanning a million rows of data to produce some aggregate results, your queries will naturally consume far more resources than those of an application developer whose queries read or write a handful of rows. Your queries may be just as efficient in terms of logical I/O per row; you may simply be looking at many more rows.

If you are running queries against a live production database, you may well be at the point where it makes sense to start segregating workloads. Most organizations reach a point where running ad hoc reporting queries against the production database starts to cause problems for the production system. One common solution is to build a separate reporting database that is fed from the production system (either via a nightly snapshot or via continuous replication), where reporting queries can run without disturbing the production application. Another common solution is to use something like Oracle Resource Manager to limit the resources available to one group of users (in this case, report writers) in order to minimize the impact on higher-priority users (in this case, users of the production system).
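
For completeness, a very rough sketch of what the Resource Manager approach might look like, assuming a hypothetical consumer group and plan for the research team (the names and limits are made up, and in practice your DBAs would set this up):

    BEGIN
      DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

      -- Hypothetical consumer group for ad hoc research/reporting queries
      DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
        consumer_group => 'RESEARCH_GROUP',
        comment        => 'Ad hoc reporting queries');

      DBMS_RESOURCE_MANAGER.CREATE_PLAN(
        plan    => 'DAYTIME_PLAN',
        comment => 'Limit reporting work during business hours');

      -- Cap the CPU share and degree of parallelism for the reporting group
      DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
        plan                     => 'DAYTIME_PLAN',
        group_or_subplan         => 'RESEARCH_GROUP',
        comment                  => 'Reporting users get a limited share',
        mgmt_p1                  => 20,
        parallel_degree_limit_p1 => 4);

      -- A plan must also cover all remaining sessions
      DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
        plan             => 'DAYTIME_PLAN',
        group_or_subplan => 'OTHER_GROUPS',
        comment          => 'All other sessions',
        mgmt_p1          => 80);

      DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
      DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
    END;
    /

Mapping users or sessions to the group and activating the plan (for example via the RESOURCE_MANAGER_PLAN initialization parameter) are separate steps that a DBA would handle.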
