Dynamic query optimization

I have a work assignment that mainly involves retrieving data from a database (Microsoft SQL Server 2008). Users will be able to choose which columns to display, choose which view to select from, and build a WHERE clause. The SQL query is constructed from what the user selects. The requirement is that the user can select ANY column from ANY view and filter by any column in the WHERE clause. The company does not want the solution to use a data warehouse / OLAP, and wants to limit third-party software. So basically they just want a .NET Windows Forms application that dynamically builds SQL queries from the GUI and connects to the database.

My concern is how to optimize the queries. I am by no means an expert at optimizing SQL queries, but my first thought was: what if the user decides to filter on a column that has no index (in the WHERE clause)? With that much flexibility, users could build queries so inefficient that they take a very long time to complete.

I understand that performance cannot be good with a lot of data if users filter on columns that don't have indexes, but is there anything I can do to improve it? Of course, I cannot just add indexes to all columns.

I'm not necessarily looking only for query optimization; I'm also wondering whether there are any server settings I can adjust, such as caching. I'm all ears and am looking for any advice that can help me improve performance.

Any suggestions?

Thank you in advance!

+6
2 answers

You really can't do much except anticipate what users are likely to do. You are in a good position, because the SQL Server optimizer will do the hard work for you (imagine building this on a key-value store)!

I would create indexes on the columns most likely to be filtered or sorted. You should try to make these filtered indexes that cover only non-NULL values, which reduces the storage cost (assuming that users will not filter on NULL values).
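As a rough illustration of the filtered-index idea, here is a small Python sketch that generates the T-SQL text for such an index. The helper name `filtered_index_ddl`, the `dbo` schema, the `IX_` naming scheme, and the table and column names are all assumptions made for the example; only the `CREATE INDEX ... WHERE ... IS NOT NULL` form is SQL Server syntax.

```python
import re

# Only plain identifiers are accepted, since the names are spliced into SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def filtered_index_ddl(table: str, column: str) -> str:
    """Return T-SQL text for an index that skips rows where `column` is NULL.

    Hypothetical helper: the dbo schema and IX_ prefix are assumptions.
    """
    for name in (table, column):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"CREATE NONCLUSTERED INDEX IX_{table}_{column} "
        f"ON dbo.[{table}] ([{column}]) "
        f"WHERE [{column}] IS NOT NULL;"
    )


# Example table and column names are made up for illustration.
ddl = filtered_index_ddl("Orders", "ShippedDate")
print(ddl)
```

The generated statement would still need to be reviewed and run by whoever administers the database; the sketch only shows the shape of the DDL.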

You can also try to precompute common joins and aggregations using indexed views. If you are willing to throw insane amounts of RAM at this problem and can accept slow writes, you can index and materialize the hell out of this database.

Finally, you can offload user queries onto a read-only log-shipping target or the like. This will contain their horrible queries.

You need to parameterize your queries, but you do not need to cache their plans in all cases. If your queries are expensive (so that compilation time is not significant), you will want to run them with OPTION (RECOMPILE) so that SQL Server can adapt to the exact runtime values of all parameters.
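A minimal sketch of what parameterized query construction with an optional recompile hint might look like, written in Python for brevity. The `ALLOWED` whitelist, the view and column names, the `@p0`-style parameter names, and the `build_query` function are all assumptions for illustration; a real application would pass the returned parameters to its database driver rather than print them.

```python
# Assumed whitelist of views and their columns; a real application would
# load this from the database metadata instead of hard-coding it.
ALLOWED = {"SalesView": {"Region", "Amount", "OrderDate"}}
OPERATORS = {"=", "<>", "<", "<=", ">", ">="}


def build_query(view, columns, filters, expensive=False):
    """Build (sql, params) from user selections.

    filters is a list of (column, operator, value) tuples joined with AND.
    Values never appear in the SQL text; they are returned as parameters.
    """
    if view not in ALLOWED:
        raise ValueError(f"unknown view: {view!r}")
    for col in list(columns) + [f[0] for f in filters]:
        if col not in ALLOWED[view]:
            raise ValueError(f"unknown column: {col!r}")

    params = {}
    clauses = []
    for i, (col, op, value) in enumerate(filters):
        if op not in OPERATORS:
            raise ValueError(f"unsupported operator: {op!r}")
        name = f"@p{i}"
        params[name] = value
        clauses.append(f"[{col}] {op} {name}")

    select_list = ", ".join(f"[{c}]" for c in columns)
    sql = f"SELECT {select_list} FROM [{view}]"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    if expensive:
        # Ask SQL Server to compile a fresh plan for these parameter values.
        sql += " OPTION (RECOMPILE)"
    return sql, params


sql, params = build_query(
    "SalesView",
    ["Region", "Amount"],
    [("Region", "=", "West"), ("Amount", ">", 1000)],
    expensive=True,
)
print(sql)
print(params)
```

Checking every view and column name against a whitelist matters here because identifiers cannot be passed as parameters and must be spliced into the statement text.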

You should also log all queries and review them to look for patterns. Your users are likely to run very similar queries all the time. Index for them.
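One possible way to look for such patterns is to count how often each column is filtered and how much time those queries take in total. The sketch below uses an in-memory list with made-up entries; the log format, the column names, and the `summarize` function are assumptions for the example.

```python
from collections import Counter

# Hypothetical log entries: (columns used in the WHERE clause, seconds taken).
query_log = [
    (("Region",), 0.4),
    (("OrderDate", "Region"), 2.5),
    (("Region",), 0.5),
    (("Comment",), 30.0),
    (("OrderDate", "Region"), 3.0),
]


def summarize(log):
    """Return per-column usage counts and total seconds spent."""
    uses = Counter()
    seconds = Counter()
    for columns, duration in log:
        for col in columns:
            uses[col] += 1
            seconds[col] += duration
    return uses, seconds


uses, seconds = summarize(query_log)
# Most frequently filtered columns first: candidates for indexing.
for col, count in uses.most_common():
    print(col, count, round(seconds[col], 1))
```

Columns that are filtered often are candidates for an index, while a rarely used column with a large total time may point to a single slow query worth investigating.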

Run sp_updatestats regularly.

Finally, I want to say that there is no silver-bullet solution to this, because if there were one, SQL Server would implement it itself so that everyone could benefit.

+4

First, to improve SQL Server's ability to optimize, cache, and compile queries / statements:

  • Make sure the user interface supports IN and BETWEEN when allowing users to build their own WHERE clause.
  • Order your AND or OR conditions so that indexed columns come first, followed by the other columns in alphabetical order.
    • If you allow nested AND and OR in a WHERE clause, this can be trickier.
  • Use parameterized queries:

 WHERE C1 = 'foo' AND C3 = 'bar' AND C2 = 42
 -- if C3 is an indexed column, this becomes:
 WHERE C3 = @parm1 AND C1 = @parm2 AND C2 = @parm3
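The ordering rule from the list above can be sketched in a few lines of Python. The `INDEXED` set and the function names are assumptions for illustration; the column names follow the C1/C2/C3 example. One benefit of a fixed order is that the same set of filters always produces the same statement text, no matter what order the user picked them in.

```python
# Assumed set of indexed columns; in practice this would be read from the
# database catalog rather than hard-coded.
INDEXED = {"C3"}


def order_conditions(columns):
    """Indexed columns first, then the rest, each group alphabetically."""
    return sorted(columns, key=lambda c: (c not in INDEXED, c))


def where_clause(columns):
    """Build a parameterized WHERE clause with equality conditions."""
    ordered = order_conditions(columns)
    parts = [f"{col} = @parm{i}" for i, col in enumerate(ordered, start=1)]
    return "WHERE " + " AND ".join(parts)


# The user picked C1, C3, C2 in that order; the output order is canonical.
print(where_clause(["C1", "C3", "C2"]))
```

The parameter values must of course be bound in the same reordered sequence as the conditions.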

Second, to help your users:

  • When listing the columns that the user can select, first list the indexed columns or make them the recommended columns.
  • Keep a record of the columns that users select and the time their queries take to complete. This information will help you tune your database in the future and improve the user experience.

EDIT: Regarding the ordering of AND or OR conditions, in response to Martin Smith's comment: this is called short-circuit evaluation.

Consider the logic

 A = True OR B = True OR C = True 

If A is true, there is no need to evaluate B or C for the condition to be true.

 A = True AND B = True AND C = True 

In this case, if A is False, there is no need to evaluate B or C for the condition to be false.
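The logic above can be observed directly in a language with guaranteed left-to-right short-circuit evaluation. The sketch below uses Python and a hypothetical `check` helper that records which operands were evaluated; it illustrates the general principle only and does not show how SQL Server chooses to order predicate evaluation.

```python
calls = []


def check(name, result):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return result


# A OR B OR C with A true: evaluation stops after A.
calls.clear()
any_true = check("A", True) or check("B", True) or check("C", True)
or_calls = list(calls)

# A AND B AND C with A false: evaluation also stops after A.
calls.clear()
all_true = check("A", False) and check("B", True) and check("C", True)
and_calls = list(calls)

print(any_true, or_calls)
print(all_true, and_calls)
```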

+1

Source: https://habr.com/ru/post/905824/

