The query optimizer is estimating that the result of one join will have only one row, when the actual row count is 2000. This causes later joins on the dataset to also carry an estimated row count of one, when some of them actually reach 30,000 rows.
With an estimate of 1 row, the QO chooses a nested loop / index seek strategy for many of the joins, which is far too slow. I worked around the problem by constraining the possible join strategies with OPTION (HASH JOIN, MERGE JOIN), which improved overall execution time from 60 minutes to 12 seconds. However, I think the QO is still producing a less than optimal plan because of the bad row count estimates. I don't want to specify the join order and strategies manually; too many queries are affected for that to be practical.
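For reference, a minimal sketch of the workaround; the table and column names below are hypothetical stand-ins, the real query is much larger:

    -- Constraining the optimizer to hash or merge joins avoids the
    -- nested loop plans it picks because of the 1-row estimate.
    SELECT p.PersonID, sp.SpecialtyCode
    FROM Person AS p
    INNER JOIN SpecializedPerson AS sp
        ON sp.PersonID = p.PersonID
    OPTION (HASH JOIN, MERGE JOIN);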
This is on Microsoft SQL Server 2000, with a moderately complex query containing several subselects joined to the main select.
I think the QO is underestimating the cardinality of the many side of the join, expecting the joining columns of the two tables to match fewer rows than they actually do.
The estimated row counts from the index scans feeding the joins are accurate; it is only the estimated row counts after certain joins that are far too low.
Statistics for all tables in the database are up to date and are refreshed automatically.
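The statistics maintenance in place is the standard SQL Server 2000 kind, roughly the following (using the same hypothetical table names as above):

    -- Refresh statistics database-wide (sampled).
    EXEC sp_updatestats;

    -- Or per table, with a full scan for maximum accuracy.
    UPDATE STATISTICS Person WITH FULLSCAN;
    UPDATE STATISTICS SpecializedPerson WITH FULLSCAN;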
One of the earliest bad joins is between a general "Person" table, holding data common to all people, and a specialized-person table that only about 5% of those people belong to. The clustered PK on both tables (and the join column) is an INT. The database is highly normalized.
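A simplified sketch of that relationship, with hypothetical column names (the real tables are much wider):

    -- General table: one row per person.
    CREATE TABLE Person (
        PersonID INT NOT NULL PRIMARY KEY CLUSTERED, -- clustered PK, INT
        FullName VARCHAR(100) NOT NULL
    );

    -- Specialized table: roughly 5% of Person rows have a row here.
    CREATE TABLE SpecializedPerson (
        PersonID INT NOT NULL PRIMARY KEY CLUSTERED  -- also the join column
            REFERENCES Person (PersonID),
        SpecialtyCode INT NOT NULL
    );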
I believe the root problem is the bad row count estimate after certain joins, so my main questions are:
- How can I fix the QO's join cardinality estimates?
- Is there a way to hint that a join will produce many rows without specifying the entire join order manually? (See the sketch below for the alternative I'm trying to avoid.)
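For context on the second question: a per-join hint like the one below is what I want to avoid, because in SQL Server specifying a join hint for any pair of tables also enforces the written join order for the whole query (again using the hypothetical names from above):

    -- Forces a hash join for this one join, but as a side effect the
    -- optimizer enforces the written join order for the entire query.
    SELECT p.PersonID, sp.SpecialtyCode
    FROM Person AS p
    INNER HASH JOIN SpecializedPerson AS sp
        ON sp.PersonID = p.PersonID;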